Journal of Shandong University (Natural Science), 2016, Vol. 51, Issue (3): 104-110. doi: 10.6040/j.issn.1671-9352.1.2015.025
LI Zhi-heng (李智恒), YANG Zhi-hao (杨志豪), LIN Hong-fei (林鸿飞)*
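Abstract: With advances in human genomics research and high-throughput technologies, the medical literature on proteins and their related diseases and drugs has grown exponentially. Text mining makes it possible to discover and extract valuable, novel protein knowledge from large volumes of biomedical text. Starting from the semantic output that SemRep produces for MEDLINE citations on a specific disease, a salient-information extraction algorithm scores and ranks that output, yielding the proteins associated with the disease and the relations between those proteins and drugs. The extracted proteins are then compared with the proteins and genes that the KEGG database lists for the same disease. The experimental results are relevant to understanding disease etiology, predicting protein function, and computer-aided drug design.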
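The abstract names the pipeline stages (score SemRep output, rank it, compare against KEGG) but not the scoring function itself. As a rough illustration only, the sketch below assumes a pointwise Kullback-Leibler style score, p * log(p / q), that rewards entities more frequent in the disease-specific predications than in a background set. This is an assumed stand-in, not the authors' algorithm. The input layout (subject, predicate, object tuples with UMLS semantic type abbreviations), the function names, and all toy identifiers such as PROTEIN_A are hypothetical.

```python
import math
from collections import Counter

# Assumption: keep only arguments typed as protein ("aapp") or gene ("gngm").
PROTEIN_TYPES = {"aapp", "gngm"}


def count_entities(predications):
    """Count protein/gene arguments across (subject, predicate, object) tuples.

    Each subject and object is a (name, semantic_type) pair.
    """
    counts = Counter()
    for subj, _predicate, obj in predications:
        for name, semtype in (subj, obj):
            if semtype in PROTEIN_TYPES:
                counts[name] += 1
    return counts


def rank_entities(disease_preds, background_preds):
    """Rank entities by p * log(p / q), highest first (assumed scoring)."""
    fg = count_entities(disease_preds)
    bg = count_entities(background_preds)
    vocab = set(fg) | set(bg)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values()) + len(vocab)  # add-one smoothing
    scores = {}
    for name, count in fg.items():
        p = count / fg_total
        q = (bg[name] + 1) / bg_total
        scores[name] = p * math.log(p / q)
    # Sort by descending score, then by name for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def compare_with_reference(ranked, reference):
    """Return (hits, precision, recall) of ranked names against a reference set."""
    predicted = {name for name, _score in ranked}
    hits = predicted & reference
    precision = len(hits) / len(predicted) if predicted else 0.0
    recall = len(hits) / len(reference) if reference else 0.0
    return hits, precision, recall


# Toy, made-up predications (not real SemRep output).
disease_preds = [
    (("PROTEIN_A", "aapp"), "ASSOCIATED_WITH", ("DISEASE_D", "dsyn")),
    (("PROTEIN_A", "aapp"), "ASSOCIATED_WITH", ("DISEASE_D", "dsyn")),
    (("DRUG_X", "phsu"), "INHIBITS", ("PROTEIN_A", "aapp")),
    (("PROTEIN_B", "aapp"), "ASSOCIATED_WITH", ("DISEASE_D", "dsyn")),
]
background_preds = [
    (("PROTEIN_B", "aapp"), "INTERACTS_WITH", ("PROTEIN_C", "aapp")),
    (("PROTEIN_B", "aapp"), "ASSOCIATED_WITH", ("DISEASE_E", "dsyn")),
]

ranked = rank_entities(disease_preds, background_preds)
# Stand-in for a KEGG disease entry; the set contents are invented.
reference = {"PROTEIN_A", "PROTEIN_C"}
hits, precision, recall = compare_with_reference(ranked[:1], reference)
print(ranked, hits, precision, recall)
```

In this toy run PROTEIN_A ranks first because it is frequent in the disease set and absent from the background, while PROTEIN_B is penalized for being common in the background. A real comparison would replace the hand-written reference set with the gene and protein list of the corresponding KEGG disease entry.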