
Pattern Learning through Distant Supervision for Extraction of Protein-Residue Associations in the Biomedical Literature


Bibliographic Details
Main Authors: Ravikumar, K. E., Haibin Liu, Cohn, J. D., Wall, M. E., Verspoor, K.
Format: Conference Proceeding
Language: English
cited_by cdi_FETCH-LOGICAL-c135t-d3d65138e718500191e959d3bb9ff121c1b02eab0d8833b888df916f800537033
cites
container_end_page 65
container_issue
container_start_page 59
container_title
container_volume 2
creator Ravikumar, K. E.
Haibin Liu
Cohn, J. D.
Wall, M. E.
Verspoor, K.
description We propose a method enabling automatic extraction of protein-specific residues from the biomedical literature. We aim to associate mentions of specific amino acids with the protein of which the residue forms a part. The methods presented in this work will enable improved protein functional site extraction from articles, ultimately supporting protein function prediction. Our method made use of linguistic patterns to identify amino acid residue mentions in text. Further, we applied an automated graph-based method to learn syntactic and semantic patterns corresponding to protein-residue pairs mentioned in the text. On a new, automatically generated data set of high-confidence protein-residue relationship sentences, established through distant supervision, the method achieved an F-measure of 0.78. This work will pave the way to improved extraction of protein functional residues from the literature.
doi_str_mv 10.1109/ICMLA.2011.112
format conference_proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6147049</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6147049</ieee_id><sourcerecordid>6147049</sourcerecordid><originalsourceid>FETCH-LOGICAL-c135t-d3d65138e718500191e959d3bb9ff121c1b02eab0d8833b888df916f800537033</originalsourceid><addsrcrecordid>eNotjctOwzAURI0QElC6ZcPGP5Dia8evZSktVAqi4rGunOSmNWqTynZQ-XtSwWxGRyOdIeQW2ASA2fvl7KWYTjgDGJifkWumlZW5Ypqfk7HVBnKpNQeR80syjvGLDVHKWtBX5LhyKWFoaYEutL7d0LQNXb_Z0kcfk2sTfe8PGL599F1Lmy7Q-TEFV6UTdg1dhS6hb7M3jL7ukU5j7CrvTnOkvh1sSB98t8faV25HCz-cudQHvCEXjdtFHP_3iHwu5h-z56x4fVrOpkVWgZApq0WtJAiDGoxkDCyglbYWZWmbBjhUUDKOrmS1MUKUxpi6saAaw5gUmgkxInd_Xo-I60Pwexd-1gpyzXIrfgHW0V5M</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Pattern Learning through Distant Supervision for Extraction of Protein-Residue Associations in the Biomedical Literature</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Ravikumar, K. E. ; Haibin Liu ; Cohn, J. D. ; Wall, M. E. ; Verspoor, K.</creator><creatorcontrib>Ravikumar, K. E. ; Haibin Liu ; Cohn, J. D. ; Wall, M. E. ; Verspoor, K.</creatorcontrib><description>We propose a method enabling automatic extraction of protein-specific residues from the biomedical literature. We aim to associate mentions of specific amino acids to the protein of which the residue forms a part. The methods presented in this work will enable improved protein functional site extraction from articles, ultimately supporting protein function prediction. Our method made use of linguistic patterns for identifying the amino acid residue mentions in text. Further, we applied an automated graph-based method to learn syntactic and semantic patterns corresponding to protein-residue pairs mentioned in the text. 
On a new automatically generated data set of high confidence protein-residue relationship sentences, established through distant supervision, the method achieved a F-measure of 0.78. This work will pave the way to improved extraction of protein functional residues from the literature.</description><identifier>ISBN: 9781457721342</identifier><identifier>ISBN: 1457721341</identifier><identifier>EISBN: 0769546072</identifier><identifier>EISBN: 9780769546070</identifier><identifier>DOI: 10.1109/ICMLA.2011.112</identifier><language>eng</language><publisher>IEEE</publisher><subject>Abstracts ; Amino acids ; Data mining ; distant supervision ; Gold ; information extraction ; Mutation mining ; pattern learning ; Protein engineering ; protein residue mining ; Proteins ; Silver ; text mining</subject><ispartof>2011 10th International Conference on Machine Learning and Applications and Workshops, 2011, Vol.2, p.59-65</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c135t-d3d65138e718500191e959d3bb9ff121c1b02eab0d8833b888df916f800537033</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6147049$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,2052,27904,54898</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6147049$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Ravikumar, K. E.</creatorcontrib><creatorcontrib>Haibin Liu</creatorcontrib><creatorcontrib>Cohn, J. D.</creatorcontrib><creatorcontrib>Wall, M. 
E.</creatorcontrib><creatorcontrib>Verspoor, K.</creatorcontrib><title>Pattern Learning through Distant Supervision for Extraction of Protein-Residue Associations in the Biomedical Literature</title><title>2011 10th International Conference on Machine Learning and Applications and Workshops</title><addtitle>icmla</addtitle><description>We propose a method enabling automatic extraction of protein-specific residues from the biomedical literature. We aim to associate mentions of specific amino acids to the protein of which the residue forms a part. The methods presented in this work will enable improved protein functional site extraction from articles, ultimately supporting protein function prediction. Our method made use of linguistic patterns for identifying the amino acid residue mentions in text. Further, we applied an automated graph-based method to learn syntactic and semantic patterns corresponding to protein-residue pairs mentioned in the text. On a new automatically generated data set of high confidence protein-residue relationship sentences, established through distant supervision, the method achieved a F-measure of 0.78. 
This work will pave the way to improved extraction of protein functional residues from the literature.</description><subject>Abstracts</subject><subject>Amino acids</subject><subject>Data mining</subject><subject>distant supervision</subject><subject>Gold</subject><subject>information extraction</subject><subject>Mutation mining</subject><subject>pattern learning</subject><subject>Protein engineering</subject><subject>protein residue mining</subject><subject>Proteins</subject><subject>Silver</subject><subject>text mining</subject><isbn>9781457721342</isbn><isbn>1457721341</isbn><isbn>0769546072</isbn><isbn>9780769546070</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2011</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotjctOwzAURI0QElC6ZcPGP5Dia8evZSktVAqi4rGunOSmNWqTynZQ-XtSwWxGRyOdIeQW2ASA2fvl7KWYTjgDGJifkWumlZW5Ypqfk7HVBnKpNQeR80syjvGLDVHKWtBX5LhyKWFoaYEutL7d0LQNXb_Z0kcfk2sTfe8PGL599F1Lmy7Q-TEFV6UTdg1dhS6hb7M3jL7ukU5j7CrvTnOkvh1sSB98t8faV25HCz-cudQHvCEXjdtFHP_3iHwu5h-z56x4fVrOpkVWgZApq0WtJAiDGoxkDCyglbYWZWmbBjhUUDKOrmS1MUKUxpi6saAaw5gUmgkxInd_Xo-I60Pwexd-1gpyzXIrfgHW0V5M</recordid><startdate>201112</startdate><enddate>201112</enddate><creator>Ravikumar, K. E.</creator><creator>Haibin Liu</creator><creator>Cohn, J. D.</creator><creator>Wall, M. E.</creator><creator>Verspoor, K.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201112</creationdate><title>Pattern Learning through Distant Supervision for Extraction of Protein-Residue Associations in the Biomedical Literature</title><author>Ravikumar, K. E. ; Haibin Liu ; Cohn, J. D. ; Wall, M. E. 
; Verspoor, K.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c135t-d3d65138e718500191e959d3bb9ff121c1b02eab0d8833b888df916f800537033</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Abstracts</topic><topic>Amino acids</topic><topic>Data mining</topic><topic>distant supervision</topic><topic>Gold</topic><topic>information extraction</topic><topic>Mutation mining</topic><topic>pattern learning</topic><topic>Protein engineering</topic><topic>protein residue mining</topic><topic>Proteins</topic><topic>Silver</topic><topic>text mining</topic><toplevel>online_resources</toplevel><creatorcontrib>Ravikumar, K. E.</creatorcontrib><creatorcontrib>Haibin Liu</creatorcontrib><creatorcontrib>Cohn, J. D.</creatorcontrib><creatorcontrib>Wall, M. E.</creatorcontrib><creatorcontrib>Verspoor, K.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ravikumar, K. E.</au><au>Haibin Liu</au><au>Cohn, J. D.</au><au>Wall, M. 
E.</au><au>Verspoor, K.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Pattern Learning through Distant Supervision for Extraction of Protein-Residue Associations in the Biomedical Literature</atitle><btitle>2011 10th International Conference on Machine Learning and Applications and Workshops</btitle><stitle>icmla</stitle><date>2011-12</date><risdate>2011</risdate><volume>2</volume><spage>59</spage><epage>65</epage><pages>59-65</pages><isbn>9781457721342</isbn><isbn>1457721341</isbn><eisbn>0769546072</eisbn><eisbn>9780769546070</eisbn><abstract>We propose a method enabling automatic extraction of protein-specific residues from the biomedical literature. We aim to associate mentions of specific amino acids to the protein of which the residue forms a part. The methods presented in this work will enable improved protein functional site extraction from articles, ultimately supporting protein function prediction. Our method made use of linguistic patterns for identifying the amino acid residue mentions in text. Further, we applied an automated graph-based method to learn syntactic and semantic patterns corresponding to protein-residue pairs mentioned in the text. On a new automatically generated data set of high confidence protein-residue relationship sentences, established through distant supervision, the method achieved a F-measure of 0.78. This work will pave the way to improved extraction of protein functional residues from the literature.</abstract><pub>IEEE</pub><doi>10.1109/ICMLA.2011.112</doi><tpages>7</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISBN: 9781457721342
ispartof 2011 10th International Conference on Machine Learning and Applications and Workshops, 2011, Vol.2, p.59-65
issn
language eng
recordid cdi_ieee_primary_6147049
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Abstracts
Amino acids
Data mining
distant supervision
Gold
information extraction
Mutation mining
pattern learning
Protein engineering
protein residue mining
Proteins
Silver
text mining
title Pattern Learning through Distant Supervision for Extraction of Protein-Residue Associations in the Biomedical Literature
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T06%3A08%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Pattern%20Learning%20through%20Distant%20Supervision%20for%20Extraction%20of%20Protein-Residue%20Associations%20in%20the%20Biomedical%20Literature&rft.btitle=2011%2010th%20International%20Conference%20on%20Machine%20Learning%20and%20Applications%20and%20Workshops&rft.au=Ravikumar,%20K.%20E.&rft.date=2011-12&rft.volume=2&rft.spage=59&rft.epage=65&rft.pages=59-65&rft.isbn=9781457721342&rft.isbn_list=1457721341&rft_id=info:doi/10.1109/ICMLA.2011.112&rft.eisbn=0769546072&rft.eisbn_list=9780769546070&rft_dat=%3Cieee_6IE%3E6147049%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c135t-d3d65138e718500191e959d3bb9ff121c1b02eab0d8833b888df916f800537033%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6147049&rfr_iscdi=true