Pattern Learning through Distant Supervision for Extraction of Protein-Residue Associations in the Biomedical Literature

Bibliographic Details
Main Authors: Ravikumar, K. E.; Liu, Haibin; Cohn, J. D.; Wall, M. E.; Verspoor, K.
Format: Conference Proceeding
Language: English
Description
Summary: We propose a method for automatically extracting protein-specific residues from the biomedical literature. We aim to associate mentions of specific amino acids with the protein of which the residue forms a part. The methods presented in this work will enable improved protein functional site extraction from articles, ultimately supporting protein function prediction. Our method uses linguistic patterns to identify amino acid residue mentions in text. Further, we apply an automated graph-based method to learn syntactic and semantic patterns corresponding to protein-residue pairs mentioned in the text. On a new, automatically generated data set of high-confidence protein-residue relationship sentences, established through distant supervision, the method achieved an F-measure of 0.78. This work paves the way to improved extraction of protein functional residues from the literature.
DOI: 10.1109/ICMLA.2011.112
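
Illustration: the summary mentions linguistic patterns for identifying amino acid residue mentions. The record does not give the actual patterns, so the following is only a minimal sketch of the general idea, assuming mentions of the form "His57", "Asp-102", or "serine 195". The lexicon, the regular expression, and the function name `find_residue_mentions` are hypothetical and are not taken from the paper.

```python
import re

# Illustrative lexicon (an assumption, not the authors' resource):
# three-letter codes mapped to one common full name per amino acid.
AMINO_ACIDS = {
    "ala": "alanine", "arg": "arginine", "asn": "asparagine", "asp": "aspartate",
    "cys": "cysteine", "gln": "glutamine", "glu": "glutamate", "gly": "glycine",
    "his": "histidine", "ile": "isoleucine", "leu": "leucine", "lys": "lysine",
    "met": "methionine", "phe": "phenylalanine", "pro": "proline", "ser": "serine",
    "thr": "threonine", "trp": "tryptophan", "tyr": "tyrosine", "val": "valine",
}

# Longest names first so "asparagine" is tried before "asp".
_NAMES = sorted(set(AMINO_ACIDS) | set(AMINO_ACIDS.values()), key=len, reverse=True)

# A residue name, an optional space or hyphen, then a sequence position.
RESIDUE_RE = re.compile(r"\b(" + "|".join(_NAMES) + r")[\s-]?(\d+)\b", re.IGNORECASE)

# Reverse lookup from full name to three-letter code.
_FULL_TO_CODE = {full: code for code, full in AMINO_ACIDS.items()}


def find_residue_mentions(text):
    """Return (three-letter code, position) pairs found in text."""
    mentions = []
    for match in RESIDUE_RE.finditer(text):
        name = match.group(1).lower()
        code = name if name in AMINO_ACIDS else _FULL_TO_CODE[name]
        mentions.append((code.capitalize(), int(match.group(2))))
    return mentions


sentence = "The catalytic triad of chymotrypsin comprises His57, Asp-102 and serine 195."
print(find_residue_mentions(sentence))
```

A pattern like this only finds the residue mentions. Linking each mention to its protein, which the summary describes as a learned graph-based step, is outside the scope of this sketch.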