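Automatic Extraction of Protein-Protein Interaction: Selected Sample Paper
2015-12-25 Source: 51due faculty team Category: Paper samples
A selected sample paper from 51Due: "Automatic Extraction of Protein-Protein Interaction". Proteins are very important in biomedical research. With the rapid development of the life sciences, the biomedical literature has been growing very quickly. Extraction technology has matured, and research on biomedical information extraction is becoming increasingly important: it is practical in itself and also plays a key role beyond that. Protein extraction has become a hot topic, but some problems remain. This sample biology paper approaches the problem from two aspects in order to improve extraction results, and designs some new features based on the characteristics of the biomedical literature.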
Abstract
Protein-protein interaction (PPI) extraction is a key precondition for constructing a protein knowledge network, and it is very important for biomedical research. This paper extracts directional protein-protein interactions from biological text using an SVM-based method. Experiments on the LLL05 corpus gave good results. They show that dependency features are important for protein-protein interaction extraction and that features related to the interaction word are effective for judging the interaction direction. Finally, we analyze the effects of the different features and outline the next steps.
Keywords: Support vector machines (SVM), Bio-Entity Relation, Protein-Protein Interaction, Entity Relation Direction.
1 Introduction
With the rapid development of the life sciences, the biomedical literature has been growing very fast. Information extraction technology has already matured. As a result, research on biomedical information extraction is becoming more and more important, and relation extraction is one of its most important tasks. It is not only practical in itself but is also the foundation of relation databases and biological knowledge networks, and it plays a key role in relation prediction and drug development. Relation extraction has become a hotspot, but some problems remain. For instance, the results are not good enough, and important information such as the direction and type of a relation is ignored.
This paper addresses two aspects: improving the results and extracting more information about each relation, such as its direction. Based on the characteristics of the biomedical literature, we designed some new features and extracted relations with SVM, a well-established machine learning model; the experiments showed good results.
2 Extracting Protein-Protein Interaction
Once protein names have been found, the relationships between them need to be ascertained. PPI extraction can be defined as a classification problem. When two protein names and one interaction word co-occur in a single sentence, the task reduces to inferring whether a PPI exists between the pair of proteins. So, first, the sentences were filtered by the simple rule that two protein names co-occur in one sentence. Second, we used a trained SVM model to solve this classification problem.
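The co-occurrence filter can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `candidate_pairs`, the whitespace tokenization, and the example sentences and protein list are all assumptions made for the sketch.

```python
from itertools import combinations


def candidate_pairs(sentences, protein_names):
    """Yield (sentence, first, second) for each pair of proteins sharing a sentence.

    Hypothetical helper: assumes protein names are single whitespace-separated
    tokens and that a dictionary of known names is already available.
    """
    for sentence in sentences:
        # Naive tokenization: split on whitespace, trim surrounding punctuation.
        tokens = [tok.strip(".,;:()") for tok in sentence.split()]
        # Keep each mentioned protein once, in order of first appearance.
        mentioned = list(dict.fromkeys(t for t in tokens if t in protein_names))
        # Sentences with fewer than two proteins produce no candidates.
        for first, second in combinations(mentioned, 2):
            yield sentence, first, second


# Illustrative input only; these sentences are not taken from the corpus.
sentences = [
    "GerE stimulates cotD transcription and inhibits cotA transcription.",
    "The sigK gene is expressed late.",
]
proteins = {"GerE", "cotD", "cotA", "sigK"}
for _, first, second in candidate_pairs(sentences, proteins):
    print(first, second)
```

Each yielded pair is then one instance for the classifier, which decides whether the pair actually interacts.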
After relation extraction, we determined the direction of each relation, because direction is important for constructing a biological network. We also cast this problem as classification.
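As a rough illustration of direction judgement as classification, the sketch below builds features around the interaction word for a protein pair. The feature names, the passive "by" cue, and the example sentence are assumptions for illustration only; the paper does not spell out its feature templates here.

```python
def direction_features(tokens, i, j, interaction_words):
    """Describe the first interaction word relative to proteins at positions i < j.

    Hypothetical feature templates; a classifier would map them to the label
    "direction" (first protein acts on second) or "inverse".
    """
    feats = {}
    for k, tok in enumerate(tokens):
        if tok.lower() not in interaction_words:
            continue
        if k < i:
            feats["iword_position"] = "before"
        elif k < j:
            feats["iword_position"] = "between"
        else:
            feats["iword_position"] = "after"
        # A following "by" hints at a passive construction, which may flip the direction.
        feats["passive_by"] = k + 1 < len(tokens) and tokens[k + 1].lower() == "by"
        feats["iword"] = tok.lower()
        break  # only the first interaction word is used in this sketch
    return feats


# Illustrative sentence, not taken from the corpus.
tokens = "cotD is activated by GerE".split()
print(direction_features(tokens, 0, 4, {"activated", "stimulates"}))
```

In this toy sentence the passive cue suggests that the second protein acts on the first, which corresponds to the "inverse" class.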
3 Results
The SVM model was trained on the standard LLL05 corpus (J. Hakenberg, et al., 2005) using the effective features (word features, POS features, logic features and dependency parsing features). In this experiment, we obtained 38,504 proteins and 51,568 PPIs between them with the SVM-based method.
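To make the training step concrete, here is a small self-contained sketch of a linear SVM trained by sub-gradient descent on the hinge loss over sparse word features. It is an assumption-laden stand-in rather than the authors' setup: the `word_features` template, the hyperparameters, and the toy examples are hypothetical, and the paper does not state its kernel or toolkit configuration.

```python
from collections import defaultdict


def word_features(tokens, i, j):
    """Sparse features for the protein pair at token positions i < j (hypothetical template)."""
    feats = {"bias": 1.0}
    for tok in tokens[i + 1:j]:
        feats["between=" + tok.lower()] = 1.0
    return feats


def score(weights, feats):
    """Linear decision value w . x for a sparse feature dictionary."""
    return sum(weights[name] * value for name, value in feats.items())


def train_linear_svm(examples, epochs=50, lr=0.1, reg=0.01):
    """Hinge-loss sub-gradient descent; examples are (feature dict, label in {+1, -1})."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, label in examples:
            violated = label * score(weights, feats) < 1.0
            # L2 regularization shrinks every weight (the bias included, for simplicity).
            for name in list(weights):
                weights[name] -= lr * reg * weights[name]
            # Inside the margin or misclassified: move the weights toward the example.
            if violated:
                for name, value in feats.items():
                    weights[name] += lr * label * value
    return weights


# Toy training set, invented for illustration (+1 = interaction, -1 = no interaction).
toy = [
    (word_features("GerE stimulates cotD".split(), 0, 2), +1),
    (word_features("sigK activates cotA".split(), 0, 2), +1),
    (word_features("cotA and cotD".split(), 0, 2), -1),
    (word_features("sigK or gerE".split(), 0, 2), -1),
]
weights = train_linear_svm(toy)
unseen = word_features("cotA stimulates sigK".split(), 0, 2)
print("interaction" if score(weights, unseen) > 0 else "no interaction")
```

A real system would add the POS, logic and dependency features as further entries in the same feature dictionary.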
The SVM-based model trained on the LLL05 corpus achieves good performance: 82.4% precision, 73.7% recall and a 77.8% F-score. The experiments on the LLL05 corpus showed that the F value came close to 80% and that the new features improved the results considerably. In conclusion, the syntactic features improved both precision and recall, while the logic features improved recall (from 57.89% to 73.68% when added on top of the syntactic features). Moreover, the syntactic features gave good results even on their own.
Result of the protein-protein interaction experiment (%)

| feature | Word + POS | + Logic | + Syntax | + Logic + Syntax |
|---|---|---|---|---|
| precision | 81.82 | 75.00 | 91.67 | 82.35 |
| recall | 47.37 | 47.37 | 57.89 | 73.68 |
| F value | 60.00 | 58.06 | 70.97 | 77.78 |
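
The F value reported for each feature set is the harmonic mean of precision and recall, F = 2PR / (P + R). The short check below recomputes it from the reported precision and recall; the function name `f_value` is an illustrative choice, and small differences in the last digit are expected because the inputs are already rounded.

```python
def f_value(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F value) for each feature set in the experiment.
reported = {
    "Word + POS": (81.82, 47.37, 60.00),
    "+ Logic": (75.00, 47.37, 58.06),
    "+ Syntax": (91.67, 57.89, 70.97),
    "+ Logic + Syntax": (82.35, 73.68, 77.78),
}
for name, (p, r, f) in reported.items():
    print(f"{name}: computed {f_value(p, r):.2f}, reported {f:.2f}")
```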
Result of the direction judgement experiment (%)

| feature / measure | Physical + Clause: direction | Physical + Clause: inverse | Subtree + Clause: direction | Subtree + Clause: inverse | Physical + Subtree + Clause: direction | Physical + Subtree + Clause: inverse |
|---|---|---|---|---|---|---|
| precision | 83.33 | 100.00 | 80.00 | 80.00 | 83.33 | 100.00 |
| recall | 100.00 | 80.00 | 80.00 | 80.00 | 100.00 | 80.00 |
| F value | 90.91 | 88.89 | 80.00 | 80.00 | 90.91 | 88.89 |
4 Conclusions
This paper extracted several groups of rational features according to the characteristics of protein-protein interaction, and designed dependency features based on the output of dependency parsing, which improved the experimental results. It then extracted features related to the interaction word and determined the interaction direction, which provides more useful information for constructing protein knowledge networks and biological entity relation networks. We conducted experiments on the LLL05 corpus and analyzed the effect of each feature. The results showed that the newly designed features effectively improved the results.
Future work includes validating the extensibility of our method, further improving relation extraction, and constructing a visual biological knowledge network.
