Advancing drug-target interaction prediction: a comprehensive graph-based approach integrating knowledge graph embedding and ProtBert pretraining

The pharmaceutical field faces a significant challenge in validating drug target interactions (DTIs) due to the time and cost involved, leading to only a fraction being experimentally verified. To expedite drug discovery, accurate computational methods are essential for predicting potential interact...

Bibliographic Details
Published in: BMC bioinformatics 2023-12, Vol.24 (1), p.488-41, Article 488
Main Authors: Djeddi, Warith Eddine; Hermi, Khalil; Ben Yahia, Sadok; Diallo, Gayo
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c632t-e3e4a44fce6ab2b74a23e97df921bc444544ff592b710cb08f6b7893f1a383d3
cites cdi_FETCH-LOGICAL-c632t-e3e4a44fce6ab2b74a23e97df921bc444544ff592b710cb08f6b7893f1a383d3
container_end_page 41
container_issue 1
container_start_page 488
container_title BMC bioinformatics
container_volume 24
creator Djeddi, Warith Eddine
Hermi, Khalil
Ben Yahia, Sadok
Diallo, Gayo
description The pharmaceutical field faces a significant challenge in validating drug target interactions (DTIs) due to the time and cost involved, leading to only a fraction being experimentally verified. To expedite drug discovery, accurate computational methods are essential for predicting potential interactions. Recently, machine learning techniques, particularly graph-based methods, have gained prominence. These methods utilize networks of drugs and targets, employing knowledge graph embedding (KGE) to represent structured information from knowledge graphs in a continuous vector space. This phenomenon highlights the growing inclination to utilize graph topologies as a means to improve the precision of predicting DTIs, hence addressing the pressing requirement for effective computational methodologies in the field of drug discovery. The present study presents a novel approach called DTIOG for the prediction of DTIs. The methodology employed in this study involves the utilization of a KGE strategy, together with the incorporation of contextual information obtained from protein sequences. More specifically, the study makes use of Protein Bidirectional Encoder Representations from Transformers (ProtBERT) for this purpose. DTIOG utilizes a two-step process to compute embedding vectors using KGE techniques. Additionally, it employs ProtBERT to determine target-target similarity. Different similarity measures, such as Cosine similarity or Euclidean distance, are utilized in the prediction procedure. In addition to the contextual embedding, the proposed unique approach incorporates local representations obtained from the Simplified Molecular Input Line Entry Specification (SMILES) of drugs and the amino acid sequences of protein targets. The effectiveness of the proposed approach was assessed through extensive experimentation on datasets pertaining to Enzymes, Ion Channels, and G-protein-coupled Receptors. 
The remarkable efficacy of DTIOG was showcased through the utilization of diverse similarity measures in order to calculate the similarities between drugs and targets. The combination of these factors, along with the incorporation of various classifiers, enabled the model to outperform existing algorithms in its ability to predict DTIs. The consistent observation of this advantage across all datasets underlines the robustness and accuracy of DTIOG in the domain of DTIs. Additionally, our case study suggests that the DTIOG can serve as a valuable tool for discovering new DTIs.
doi_str_mv 10.1186/s12859-023-05593-6
format article
pmid 38114937
publisher England: BioMed Central Ltd
date 2023-12-19
tpages 41
rights 2023. The Author(s). Distributed under a Creative Commons Attribution 4.0 International License
peer_reviewed true
oa free_for_read
orcid https://orcid.org/0000-0001-8939-8948
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10731821/pdf/
backlink https://www.ncbi.nlm.nih.gov/pubmed/38114937
https://hal.science/hal-04383004
fulltext fulltext
identifier ISSN: 1471-2105
ispartof BMC bioinformatics, 2023-12, Vol.24 (1), p.488-41, Article 488
issn 1471-2105
1471-2105
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_d00eb7fc2a7743fca095ca36f2f5655c
source PubMed (Medline); Publicly Available Content Database
subjects Algorithms
Amino acids
Analysis
Case studies
Clinical trials
Computer applications
Computer Science
Cosine similarity
COVID-19
Datasets
Drug Development - methods
Drug discovery
Drug Interactions
Drugs
Drug–target interaction prediction
Embedding
Euclidean geometry
Evaluation
G protein-coupled receptors
Ion channels
Knowledge
Knowledge Bases
Knowledge graph embedding
Knowledge representation
Ligands
Machine learning
Neural networks
Pandemics
Pattern Recognition, Automated
Predictions
ProtBERT
Proteins
Proteins - chemistry
Research methodology
Severe acute respiratory syndrome coronavirus 2
Similarity
Therapeutic targets
Topology
Vector spaces
Viruses
title Advancing drug-target interaction prediction: a comprehensive graph-based approach integrating knowledge graph embedding and ProtBert pretraining
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T18%3A06%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Advancing%20drug-target%20interaction%20prediction:%20a%20comprehensive%20graph-based%20approach%20integrating%20knowledge%20graph%20embedding%20and%20ProtBert%20pretraining&rft.jtitle=BMC%20bioinformatics&rft.au=Djeddi,%20Warith%20Eddine&rft.date=2023-12-19&rft.volume=24&rft.issue=1&rft.spage=488&rft.epage=41&rft.pages=488-41&rft.artnum=488&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/s12859-023-05593-6&rft_dat=%3Cgale_doaj_%3EA776792630%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c632t-e3e4a44fce6ab2b74a23e97df921bc444544ff592b710cb08f6b7893f1a383d3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2914277520&rft_id=info:pmid/38114937&rft_galeid=A776792630&rfr_iscdi=true