
Literature mining for the discovery of hidden connections between drugs, genes and diseases

The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use...
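
The co-occurrence idea mentioned above, together with the ABC-principle named in this record's full description (A and C never co-occur directly but share B-intermediates), can be illustrated with a small sketch. This is not CoPub Discovery's implementation: the function names `build_cooccurrence` and `hidden_links`, the concept labels, and the toy corpus are all invented for illustration, and the sketch simply counts shared intermediates, with none of the statistical scoring that a real tool would need.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(docs):
    """Map each concept to the set of concepts it shares at least one document with."""
    neighbours = defaultdict(set)
    for concepts in docs:
        # Every unordered pair of concepts in one document counts as a co-occurrence.
        for a, b in combinations(sorted(set(concepts)), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


def hidden_links(neighbours, a):
    """Return {C: sorted shared B-intermediates} for every C that never co-occurs with A."""
    direct = neighbours.get(a, set())
    candidates = defaultdict(set)
    for b in direct:
        for c in neighbours[b]:
            # Keep C only if it is not A itself and has no direct link to A.
            if c != a and c not in direct:
                candidates[c].add(b)
    return {c: sorted(bs) for c, bs in candidates.items()}


# Hypothetical toy corpus: each set holds the concepts tagged in one abstract.
docs = [
    {"drug_X", "gene_1"},
    {"drug_X", "gene_2"},
    {"gene_1", "disease_Y"},
    {"gene_2", "disease_Y"},
    {"gene_2", "disease_Z"},
]

links = hidden_links(build_cooccurrence(docs), "drug_X")
```

In this toy corpus `drug_X` never appears alongside either disease, yet both diseases are reachable through shared genes, so they surface as candidate hidden relationships. Ranking candidates by the number (or strength) of shared intermediates would be one plausible next step, but that choice is an assumption of the sketch, not something this record specifies.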

Bibliographic Details
Published in: PLoS computational biology 2010-09, Vol.6 (9), p.e1000943-136
Main Authors: Frijters, Raoul; van Vugt, Marianne; Smeets, Ruben; van Schaik, René; de Vlieg, Jacob; Alkema, Wynand
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c664t-c84484dbb4c0f76c31fdd31996809deb480f09c259ccb875c44b24033b3ad72f3
cites cdi_FETCH-LOGICAL-c664t-c84484dbb4c0f76c31fdd31996809deb480f09c259ccb875c44b24033b3ad72f3
container_end_page 136
container_issue 9
container_start_page e1000943
container_title PLoS computational biology
container_volume 6
creator Frijters, Raoul
van Vugt, Marianne
Smeets, Ruben
van Schaik, René
de Vlieg, Jacob
Alkema, Wynand
description The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.
doi_str_mv 10.1371/journal.pcbi.1000943
format article
fullrecord <record><control><sourceid>gale_plos_</sourceid><recordid>TN_cdi_plos_journals_1313173155</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A238912229</galeid><doaj_id>oai_doaj_org_article_e2699e9fc527418bbf2dec785f3f5188</doaj_id><sourcerecordid>A238912229</sourcerecordid><originalsourceid>FETCH-LOGICAL-c664t-c84484dbb4c0f76c31fdd31996809deb480f09c259ccb875c44b24033b3ad72f3</originalsourceid><addsrcrecordid>eNqVkl2LEzEUhgdR3LX6D0QHvBDB1nzNJLkRlsWPQlHw48qLkElOplmmSTeZWd1_b2q7yxYEkVwknDzve8ibU1VPMVpgyvGbiziloIfF1nR-gRFCktF71SluGjrntBH375xPqkc5XyBUjrJ9WJ0QJETDuTitfqz8CEmPU4J644MPfe1iqsc11NZnE68gXdfR1WtvLYTaxBDAjD6GXHcw_oRSs2nq8-u6hwC51sHuhKAz5MfVA6eHDE8O-6z6_v7dt_OP89XnD8vzs9XctC0b50YwJpjtOmaQ462h2FlLsZStQNJCxwRySBrSSGM6wRvDWEcYorSj2nLi6Kx6vvfdDjGrQy5ZYVoWp7sQZtVyT9ioL9Q2-Y1O1ypqr_4UYuqVTqM3AyggrZQgnWkIZ1h0nSMWDBeNo67BQhSvt4duU7cBayCMSQ9Hpsc3wa9VH68UkYxxgYrBy4NBipcT5FFtStIwDDpAnLISnGAiJG7-SfKmbVvMRFvIF3uy1-UNPrhYWpsdrc4ILWaEEFmoxV-osixsfPlacL7UjwSvjgSFGeHX2OspZ7X8-uU_2E_HLNuzJsWcE7jb-DBSu-m--UW1m251mO4ie3Y3-lvRzTjT325O9h8</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>756661486</pqid></control><display><type>article</type><title>Literature mining for the discovery of hidden connections between drugs, genes and diseases</title><source>PubMed Central(OA)</source><source>ProQuest - Publicly Available Content Database</source><creator>Frijters, Raoul ; van Vugt, Marianne ; Smeets, Ruben ; van Schaik, René ; de Vlieg, Jacob ; Alkema, Wynand</creator><contributor>Rzhetsky, Andrey</contributor><creatorcontrib>Frijters, Raoul ; van Vugt, Marianne ; Smeets, Ruben ; van Schaik, René ; de Vlieg, Jacob ; Alkema, Wynand ; Rzhetsky, Andrey</creatorcontrib><description>The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. 
A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. 
This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.</description><identifier>ISSN: 1553-7358</identifier><identifier>ISSN: 1553-734X</identifier><identifier>EISSN: 1553-7358</identifier><identifier>DOI: 10.1371/journal.pcbi.1000943</identifier><identifier>PMID: 20885778</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Apoptosis ; Computational biology ; Computational Biology - methods ; Computational Biology/Literature Analysis ; Data Mining - methods ; Disease ; Drug Discovery ; Evaluation ; Experiments ; Genes ; Humans ; Hypotheses ; Leukocytes, Mononuclear - physiology ; MEDLINE ; Metabolic Networks and Pathways ; Methods ; Molecular Biology/Bioinformatics ; Pathology ; Pattern Recognition, Automated - methods ; Pharmaceutical Preparations ; Pharmacology/Drug Development ; Reproducibility of Results ; ROC Curve ; Signal Transduction ; Software ; Technology application</subject><ispartof>PLoS computational biology, 2010-09, Vol.6 (9), p.e1000943-136</ispartof><rights>COPYRIGHT 2010 Public Library of Science</rights><rights>Frijters et al. 2010</rights><rights>2010 Frijters et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited: Frijters R, van Vugt M, Smeets R, van Schaik R, de Vlieg J, et al. (2010) Literature Mining for the Discovery of Hidden Connections between Drugs, Genes and Diseases. PLoS Comput Biol 6(9): e1000943. 
doi:10.1371/journal.pcbi.1000943</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c664t-c84484dbb4c0f76c31fdd31996809deb480f09c259ccb875c44b24033b3ad72f3</citedby><cites>FETCH-LOGICAL-c664t-c84484dbb4c0f76c31fdd31996809deb480f09c259ccb875c44b24033b3ad72f3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2944780/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2944780/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,37013,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/20885778$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Rzhetsky, Andrey</contributor><creatorcontrib>Frijters, Raoul</creatorcontrib><creatorcontrib>van Vugt, Marianne</creatorcontrib><creatorcontrib>Smeets, Ruben</creatorcontrib><creatorcontrib>van Schaik, René</creatorcontrib><creatorcontrib>de Vlieg, Jacob</creatorcontrib><creatorcontrib>Alkema, Wynand</creatorcontrib><title>Literature mining for the discovery of hidden connections between drugs, genes and diseases</title><title>PLoS computational biology</title><addtitle>PLoS Comput Biol</addtitle><description>The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. 
Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. 
This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.</description><subject>Apoptosis</subject><subject>Computational biology</subject><subject>Computational Biology - methods</subject><subject>Computational Biology/Literature Analysis</subject><subject>Data Mining - methods</subject><subject>Disease</subject><subject>Drug Discovery</subject><subject>Evaluation</subject><subject>Experiments</subject><subject>Genes</subject><subject>Humans</subject><subject>Hypotheses</subject><subject>Leukocytes, Mononuclear - physiology</subject><subject>MEDLINE</subject><subject>Metabolic Networks and Pathways</subject><subject>Methods</subject><subject>Molecular Biology/Bioinformatics</subject><subject>Pathology</subject><subject>Pattern Recognition, Automated - methods</subject><subject>Pharmaceutical Preparations</subject><subject>Pharmacology/Drug Development</subject><subject>Reproducibility of Results</subject><subject>ROC Curve</subject><subject>Signal Transduction</subject><subject>Software</subject><subject>Technology 
application</subject><issn>1553-7358</issn><issn>1553-734X</issn><issn>1553-7358</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2010</creationdate><recordtype>article</recordtype><sourceid>DOA</sourceid><recordid>eNqVkl2LEzEUhgdR3LX6D0QHvBDB1nzNJLkRlsWPQlHw48qLkElOplmmSTeZWd1_b2q7yxYEkVwknDzve8ibU1VPMVpgyvGbiziloIfF1nR-gRFCktF71SluGjrntBH375xPqkc5XyBUjrJ9WJ0QJETDuTitfqz8CEmPU4J644MPfe1iqsc11NZnE68gXdfR1WtvLYTaxBDAjD6GXHcw_oRSs2nq8-u6hwC51sHuhKAz5MfVA6eHDE8O-6z6_v7dt_OP89XnD8vzs9XctC0b50YwJpjtOmaQ462h2FlLsZStQNJCxwRySBrSSGM6wRvDWEcYorSj2nLi6Kx6vvfdDjGrQy5ZYVoWp7sQZtVyT9ioL9Q2-Y1O1ypqr_4UYuqVTqM3AyggrZQgnWkIZ1h0nSMWDBeNo67BQhSvt4duU7cBayCMSQ9Hpsc3wa9VH68UkYxxgYrBy4NBipcT5FFtStIwDDpAnLISnGAiJG7-SfKmbVvMRFvIF3uy1-UNPrhYWpsdrc4ILWaEEFmoxV-osixsfPlacL7UjwSvjgSFGeHX2OspZ7X8-uU_2E_HLNuzJsWcE7jb-DBSu-m--UW1m251mO4ie3Y3-lvRzTjT325O9h8</recordid><startdate>20100901</startdate><enddate>20100901</enddate><creator>Frijters, Raoul</creator><creator>van Vugt, Marianne</creator><creator>Smeets, Ruben</creator><creator>van Schaik, René</creator><creator>de Vlieg, Jacob</creator><creator>Alkema, Wynand</creator><general>Public Library of Science</general><general>Public Library of Science (PLoS)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISN</scope><scope>ISR</scope><scope>7X8</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>RC3</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20100901</creationdate><title>Literature mining for the discovery of hidden connections between drugs, genes and diseases</title><author>Frijters, Raoul ; van Vugt, Marianne ; Smeets, Ruben ; van Schaik, René ; de Vlieg, Jacob ; Alkema, 
Wynand</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c664t-c84484dbb4c0f76c31fdd31996809deb480f09c259ccb875c44b24033b3ad72f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Apoptosis</topic><topic>Computational biology</topic><topic>Computational Biology - methods</topic><topic>Computational Biology/Literature Analysis</topic><topic>Data Mining - methods</topic><topic>Disease</topic><topic>Drug Discovery</topic><topic>Evaluation</topic><topic>Experiments</topic><topic>Genes</topic><topic>Humans</topic><topic>Hypotheses</topic><topic>Leukocytes, Mononuclear - physiology</topic><topic>MEDLINE</topic><topic>Metabolic Networks and Pathways</topic><topic>Methods</topic><topic>Molecular Biology/Bioinformatics</topic><topic>Pathology</topic><topic>Pattern Recognition, Automated - methods</topic><topic>Pharmaceutical Preparations</topic><topic>Pharmacology/Drug Development</topic><topic>Reproducibility of Results</topic><topic>ROC Curve</topic><topic>Signal Transduction</topic><topic>Software</topic><topic>Technology application</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Frijters, Raoul</creatorcontrib><creatorcontrib>van Vugt, Marianne</creatorcontrib><creatorcontrib>Smeets, Ruben</creatorcontrib><creatorcontrib>van Schaik, René</creatorcontrib><creatorcontrib>de Vlieg, Jacob</creatorcontrib><creatorcontrib>Alkema, Wynand</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Canada</collection><collection>Gale In Context: Science</collection><collection>MEDLINE - Academic</collection><collection>Technology Research Database</collection><collection>Engineering Research 
Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PLoS computational biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Frijters, Raoul</au><au>van Vugt, Marianne</au><au>Smeets, Ruben</au><au>van Schaik, René</au><au>de Vlieg, Jacob</au><au>Alkema, Wynand</au><au>Rzhetsky, Andrey</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Literature mining for the discovery of hidden connections between drugs, genes and diseases</atitle><jtitle>PLoS computational biology</jtitle><addtitle>PLoS Comput Biol</addtitle><date>2010-09-01</date><risdate>2010</risdate><volume>6</volume><issue>9</issue><spage>e1000943</spage><epage>136</epage><pages>e1000943-136</pages><issn>1553-7358</issn><issn>1553-734X</issn><eissn>1553-7358</eissn><abstract>The scientific literature represents a rich source for retrieval of knowledge on associations between biomedical concepts such as genes, diseases and cellular processes. A commonly used method to establish relationships between biomedical concepts from literature is co-occurrence. Apart from its use in knowledge retrieval, the co-occurrence method is also well-suited to discover new, hidden relationships between biomedical concepts following a simple ABC-principle, in which A and C have no direct relationship, but are connected via shared B-intermediates. In this paper we describe CoPub Discovery, a tool that mines the literature for new relationships between biomedical concepts. Statistical analysis using ROC curves showed that CoPub Discovery performed well over a wide range of settings and keyword thesauri. 
We subsequently used CoPub Discovery to search for new relationships between genes, drugs, pathways and diseases. Several of the newly found relationships were validated using independent literature sources. In addition, new predicted relationships between compounds and cell proliferation were validated and confirmed experimentally in an in vitro cell proliferation assay. The results show that CoPub Discovery is able to identify novel associations between genes, drugs, pathways and diseases that have a high probability of being biologically valid. This makes CoPub Discovery a useful tool to unravel the mechanisms behind disease, to find novel drug targets, or to find novel applications for existing drugs.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>20885778</pmid><doi>10.1371/journal.pcbi.1000943</doi><tpages>3</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1553-7358
ispartof PLoS computational biology, 2010-09, Vol.6 (9), p.e1000943-136
issn 1553-7358
1553-734X
1553-7358
language eng
recordid cdi_plos_journals_1313173155
source PubMed Central(OA); ProQuest - Publicly Available Content Database
subjects Apoptosis
Computational biology
Computational Biology - methods
Computational Biology/Literature Analysis
Data Mining - methods
Disease
Drug Discovery
Evaluation
Experiments
Genes
Humans
Hypotheses
Leukocytes, Mononuclear - physiology
MEDLINE
Metabolic Networks and Pathways
Methods
Molecular Biology/Bioinformatics
Pathology
Pattern Recognition, Automated - methods
Pharmaceutical Preparations
Pharmacology/Drug Development
Reproducibility of Results
ROC Curve
Signal Transduction
Software
Technology application
title Literature mining for the discovery of hidden connections between drugs, genes and diseases
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T09%3A57%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Literature%20mining%20for%20the%20discovery%20of%20hidden%20connections%20between%20drugs,%20genes%20and%20diseases&rft.jtitle=PLoS%20computational%20biology&rft.au=Frijters,%20Raoul&rft.date=2010-09-01&rft.volume=6&rft.issue=9&rft.spage=e1000943&rft.epage=136&rft.pages=e1000943-136&rft.issn=1553-7358&rft.eissn=1553-7358&rft_id=info:doi/10.1371/journal.pcbi.1000943&rft_dat=%3Cgale_plos_%3EA238912229%3C/gale_plos_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c664t-c84484dbb4c0f76c31fdd31996809deb480f09c259ccb875c44b24033b3ad72f3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=756661486&rft_id=info:pmid/20885778&rft_galeid=A238912229&rfr_iscdi=true