
Multimodal representation learning for predicting molecule-disease relations

Bibliographic Details
Published in: Bioinformatics (Oxford, England), 2023-02, Vol.39 (2)
Main Authors: Wen, Jun, Zhang, Xiang, Rush, Everett, Panickan, Vidul A, Li, Xingyu, Cai, Tianrun, Zhou, Doudou, Ho, Yuk-Lam, Costa, Lauren, Begoli, Edmon, Hong, Chuan, Gaziano, J Michael, Cho, Kelly, Lu, Junwei, Liao, Katherine P, Zitnik, Marinka, Cai, Tianxi
Format: Article
Language: English
Description
Summary: Predicting molecule-disease indications and side effects is important for drug development and pharmacovigilance. Comprehensively mining molecule-molecule, molecule-disease and disease-disease semantic dependencies can potentially improve prediction performance. We introduce a Multi-Modal REpresentation Mapping Approach to Predicting molecular-disease relations (M2REMAP) by incorporating clinical semantics learned from electronic health records (EHR) of 12.6 million patients. Specifically, M2REMAP first learns a multimodal molecule representation that synthesizes chemical property and clinical semantic information by mapping molecule chemicals via a deep neural network onto the clinical semantic embedding space shared by drugs, diseases and other common clinical concepts. To infer molecule-disease relations, M2REMAP combines multimodal molecule representation and disease semantic embedding to jointly infer indications and side effects. We extensively evaluate M2REMAP on molecule indications, side effects and interactions. Results show that incorporating EHR embeddings improves performance significantly, for example, attaining an improvement over the baseline models by 23.6% in PRC-AUC on indications and 23.9% on side effects. Further, M2REMAP overcomes the limitation of existing methods and effectively predicts drugs for novel diseases and emerging pathogens. The code is available at https://github.com/celehs/M2REMAP, and prediction results are provided at https://shiny.parse-health.org/drugs-diseases-dev/. Supplementary data are available at Bioinformatics online.
ISSN: 1367-4811, 1367-4803
DOI: 10.1093/bioinformatics/btad085
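
Illustration: The summary describes two steps: mapping a molecule's chemical features through a neural network onto a clinical embedding space, then combining the mapped representation with disease embeddings to score relations. The sketch below is a toy version of that idea and is not the authors' implementation (see the linked repository for that). Everything in it is an assumption made for illustration: the data are synthetic, the dimensions are arbitrary, the network is a small two-layer perceptron trained with plain gradient descent, and cosine similarity stands in for whatever relation predictor the paper actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: 200 molecules, 32 chemical features,
# 64 hidden units, 16-dimensional clinical embedding space.
n_mol, d_chem, d_hidden, d_emb = 200, 32, 64, 16

# Synthetic stand-ins for real data (assumption, not EHR-derived):
# X_chem  - chemical feature vectors of known drugs
# E_drug  - "clinical semantic" embeddings of the same drugs
X_chem = rng.normal(size=(n_mol, d_chem))
W_true = rng.normal(size=(d_chem, d_emb)) / np.sqrt(d_chem)
E_drug = np.tanh(X_chem @ W_true)

# Two-layer perceptron that maps chemical features to the embedding space.
W1 = rng.normal(size=(d_chem, d_hidden)) * 0.1
b1 = np.zeros(d_hidden)
W2 = rng.normal(size=(d_hidden, d_emb)) * 0.1
b2 = np.zeros(d_emb)


def map_molecule(x):
    """Map chemical features (1-D or 2-D array) into the embedding space."""
    h = np.maximum(x @ W1 + b1, 0.0)  # ReLU hidden layer
    return h @ W2 + b2


def mse(pred, target):
    return float(np.mean((pred - target) ** 2))


initial_loss = mse(map_molecule(X_chem), E_drug)

# Full-batch gradient descent on the mean squared error.
lr = 0.05
for _ in range(500):
    pre = X_chem @ W1 + b1
    h = np.maximum(pre, 0.0)
    pred = h @ W2 + b2
    grad_pred = 2.0 * (pred - E_drug) / pred.size
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_pre = (grad_pred @ W2.T) * (pre > 0)
    grad_W1 = X_chem.T @ grad_pre
    grad_b1 = grad_pre.sum(axis=0)
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

final_loss = mse(map_molecule(X_chem), E_drug)


def cosine_scores(mol_vec, disease_emb):
    """Cosine similarity between one mapped molecule and each disease row."""
    mol_unit = mol_vec / np.linalg.norm(mol_vec)
    dis_unit = disease_emb / np.linalg.norm(disease_emb, axis=1, keepdims=True)
    return dis_unit @ mol_unit


# Score a new molecule against five hypothetical disease embeddings.
E_disease = rng.normal(size=(5, d_emb))
x_new = rng.normal(size=d_chem)
scores = cosine_scores(map_molecule(x_new), E_disease)
ranking = np.argsort(-scores)  # disease indices, most similar first

print("loss before/after training:", initial_loss, final_loss)
print("disease ranking for the new molecule:", ranking)
```

Because the mapping depends only on chemical features, a molecule with no clinical history can still be placed in the shared space and compared with any disease embedding, which is the property the summary highlights for novel diseases and emerging pathogens.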