A Graph Kernel for Protein-Protein Interaction Extraction
Authors: Airola A, Pyysalo S, Björne J, Pahikkala T, Ginter F, Salakoski T
Editors: Dina Demner-Fushman, Sophia Ananiadou, K. Bretonnel Cohen, John Pestian, Jun’ichi Tsujii, Bonnie Webber
Conference: Workshop on Biomedical Natural Language Processing
Publisher: Association for Computational Linguistics
Year: 2008
Book title: Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing (BioNLP 2008)
URL: http://aclweb.org/anthology-new/W/W08/W08-0601.pdf
In this paper, we propose a graph-kernel-based approach for the automated extraction of protein-protein interactions (PPI) from scientific literature. In contrast to earlier approaches to PPI extraction, the introduced all-dependency-paths kernel can consider full, general dependency graphs. We evaluate the proposed method across five publicly available PPI corpora, providing the most comprehensive evaluation done for a machine-learning-based PPI-extraction system. Our method is shown to achieve state-of-the-art performance with respect to comparable evaluations, achieving 56.4 F-score and 84.8 AUC on the AImed corpus. Further, we identify several pitfalls that can make evaluations of PPI-extraction systems incomparable, or even invalid. These include incorrect cross-validation strategies and problems related to comparing F-score results achieved on different evaluation resources.
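
The abstract does not spell out which cross-validation strategies are incorrect. One plausible reading, assumed here purely for illustration, is that candidate protein pairs drawn from the same document should not be split between training and test folds. The following minimal Python sketch shows a document-level split under that assumption. The function names, document identifiers, and toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def document_level_folds(doc_ids, n_folds):
    """Group example indices into folds so that every example from the
    same document lands in the same fold (hypothetical helper)."""
    # Distinct documents in first-seen order, so the result is deterministic.
    unique_docs = list(dict.fromkeys(doc_ids))
    # Round-robin assignment of documents (not individual examples) to folds.
    doc_to_fold = {doc: i % n_folds for i, doc in enumerate(unique_docs)}
    folds = defaultdict(list)
    for idx, doc in enumerate(doc_ids):
        folds[doc_to_fold[doc]].append(idx)
    return [folds[k] for k in range(n_folds)]


def train_test_splits(doc_ids, n_folds):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = document_level_folds(doc_ids, n_folds)
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


# Toy data (assumed): one entry per candidate protein pair, giving the
# identifier of the document the pair was extracted from.
pair_doc_ids = ["d1", "d1", "d2", "d3", "d3", "d3", "d4"]

for train, test in train_test_splits(pair_doc_ids, n_folds=2):
    train_docs = {pair_doc_ids[i] for i in train}
    test_docs = {pair_doc_ids[i] for i in test}
    # No document contributes examples to both sides of the split.
    assert train_docs.isdisjoint(test_docs)
```

Assigning folds per document rather than per candidate pair keeps near-duplicate examples, such as several pairs from one sentence, from appearing on both sides of a split. The trade-off is that fold sizes can become uneven when documents contain very different numbers of pairs.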