Multiview Incomplete Knowledge Graph Integration with application to cross-institutional EHR data harmonization

The growing availability of electronic health records (EHR) data opens opportunities for integrative analysis of multi-institutional EHR to produce generalizable knowledge. A key barrier to such integrative analyses is the lack of semantic interoperability across different institutions due to coding...

Bibliographic Details
Published in: Journal of biomedical informatics 2022-09, Vol.133, p.104147-104147, Article 104147
Main Authors: Zhou, Doudou; Gan, Ziming; Shi, Xu; Patwari, Alina; Rush, Everett; Bonzel, Clara-Lea; Panickan, Vidul A.; Hong, Chuan; Ho, Yuk-Lam; Cai, Tianrun; Costa, Lauren; Li, Xiaoou; Castro, Victor M.; Murphy, Shawn N.; Brat, Gabriel; Weber, Griffin; Avillach, Paul; Gaziano, J. Michael; Cho, Kelly; Liao, Katherine P.; Lu, Junwei; Cai, Tianxi
Format: Article
Language: English
Subjects: Code mapping; Knowledge graph; PMI matrix; Transfer learning; Word embedding
cited_by cdi_FETCH-LOGICAL-c373t-fffe3710a8b5e98c88dd408419a6d29e29e9c5a46ca48019b0d1197b5c4ed04c3
cites cdi_FETCH-LOGICAL-c373t-fffe3710a8b5e98c88dd408419a6d29e29e9c5a46ca48019b0d1197b5c4ed04c3
container_end_page 104147
container_issue
container_start_page 104147
container_title Journal of biomedical informatics
container_volume 133
creator Zhou, Doudou
Gan, Ziming
Shi, Xu
Patwari, Alina
Rush, Everett
Bonzel, Clara-Lea
Panickan, Vidul A.
Hong, Chuan
Ho, Yuk-Lam
Cai, Tianrun
Costa, Lauren
Li, Xiaoou
Castro, Victor M.
Murphy, Shawn N.
Brat, Gabriel
Weber, Griffin
Avillach, Paul
Gaziano, J. Michael
Cho, Kelly
Liao, Katherine P.
Lu, Junwei
Cai, Tianxi
description The growing availability of electronic health records (EHR) data opens opportunities for integrative analysis of multi-institutional EHR to produce generalizable knowledge. A key barrier to such integrative analyses is the lack of semantic interoperability across institutions due to coding differences. We propose a Multiview Incomplete Knowledge Graph Integration (MIKGI) algorithm to integrate information from multiple sources with partially overlapping EHR concept codes to enable translations between healthcare systems. The MIKGI algorithm combines knowledge graph information from (i) embeddings trained from the co-occurrence patterns of medical codes within each EHR system and (ii) semantic embeddings of the textual strings of all medical codes obtained from the Self-Aligning Pretrained BERT (SAPBERT) algorithm. Because coding is heterogeneous across healthcare systems, each EHR source provides only partial coverage of the available codes. MIKGI synthesizes the incomplete knowledge graphs derived from these multi-source embeddings by minimizing a spherical loss function that combines the pairwise directional similarities of embeddings computed from all available sources. MIKGI outputs harmonized semantic embedding vectors for all EHR codes, which improves the quality of the embeddings and enables direct assessment of both similarity and relatedness between any pair of codes from multiple healthcare systems. With EHR co-occurrence data from Veterans Affairs (VA) healthcare and Mass General Brigham (MGB), the MIKGI algorithm produces high-quality embeddings for a variety of downstream tasks, including detecting known similar or related entity pairs and mapping VA local codes to the relevant EHR codes used at MGB. Based on the cosine similarity of the MIKGI-trained embeddings, the AUC was 0.918 for detecting similar entity pairs and 0.809 for detecting related pairs. For cross-institutional medical code mapping, the top-1 and top-5 accuracies were 91.0% and 97.5% when mapping medication codes at VA to RxNorm medication codes at MGB, and 59.1% and 75.8% when mapping VA local laboratory codes to the LOINC hierarchy. When trained with 500 labels, the lab code mapping attained top-1 and top-5 accuracies of 77.7% and 87.9%. MIKGI also attained the best performance in selecting VA local lab codes for desired laboratory tests and COVID-19 related features for COVID EHR studies. Compared to existing methods, MIKGI attained the most robust performance with accuracy the highest or
doi_str_mv 10.1016/j.jbi.2022.104147
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2694415102</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1532046422001599</els_id><sourcerecordid>2694415102</sourcerecordid><originalsourceid>FETCH-LOGICAL-c373t-fffe3710a8b5e98c88dd408419a6d29e29e9c5a46ca48019b0d1197b5c4ed04c3</originalsourceid><addsrcrecordid>eNp9kF9LwzAUxYMoOKcfwLc8-tKZtOk_fJKh23AiiD6HNLndUrqmJqlDP73ZKj4KF3Jzck7g_BC6pmRGCc1um1lT6VlM4jjcGWX5CZrQNIkjwgpy-rdn7BxdONcQQmmaZhNknofW608Ne7zqpNn1LXjAT53Zt6A2gBdW9Nvw5GFjhdemw3vtt1j0favlKHiDpTXORbpzXvvhIIoWPyxfsRJe4K2wO9Pp76P7Ep3VonVw9XtO0fvjw9t8Ga1fFqv5_TqSSZ74qK5rSHJKRFGlUBayKJRipGC0FJmKSwhTylSwTIrQj5YVUZSWeZVKBoowmUzRzfhvb83HAM7znXYS2lZ0YAbH46xkjKaUxMFKR-uxhYWa91bvhP3ilPADXN7wAJcf4PIRbsjcjRkIHQI9y53U0ElQ2oL0XBn9T_oHqEWEKw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2694415102</pqid></control><display><type>article</type><title>Multiview Incomplete Knowledge Graph Integration with application to cross-institutional EHR data harmonization</title><source>Elsevier</source><creator>Zhou, Doudou ; Gan, Ziming ; Shi, Xu ; Patwari, Alina ; Rush, Everett ; Bonzel, Clara-Lea ; Panickan, Vidul A. ; Hong, Chuan ; Ho, Yuk-Lam ; Cai, Tianrun ; Costa, Lauren ; Li, Xiaoou ; Castro, Victor M. ; Murphy, Shawn N. ; Brat, Gabriel ; Weber, Griffin ; Avillach, Paul ; Gaziano, J. Michael ; Cho, Kelly ; Liao, Katherine P. ; Lu, Junwei ; Cai, Tianxi</creator><creatorcontrib>Zhou, Doudou ; Gan, Ziming ; Shi, Xu ; Patwari, Alina ; Rush, Everett ; Bonzel, Clara-Lea ; Panickan, Vidul A. ; Hong, Chuan ; Ho, Yuk-Lam ; Cai, Tianrun ; Costa, Lauren ; Li, Xiaoou ; Castro, Victor M. ; Murphy, Shawn N. ; Brat, Gabriel ; Weber, Griffin ; Avillach, Paul ; Gaziano, J. Michael ; Cho, Kelly ; Liao, Katherine P. 
; Lu, Junwei ; Cai, Tianxi</creatorcontrib><description>The growing availability of electronic health records (EHR) data opens opportunities for integrative analysis of multi-institutional EHR to produce generalizable knowledge. A key barrier to such integrative analyses is the lack of semantic interoperability across different institutions due to coding differences. We propose a Multiview Incomplete Knowledge Graph Integration (MIKGI) algorithm to integrate information from multiple sources with partially overlapping EHR concept codes to enable translations between healthcare systems. The MIKGI algorithm combines knowledge graph information from (i) embeddings trained from the co-occurrence patterns of medical codes within each EHR system and (ii) semantic embeddings of the textual strings of all medical codes obtained from the Self-Aligning Pretrained BERT (SAPBERT) algorithm. Due to the heterogeneity in the coding across healthcare systems, each EHR source provides partial coverage of the available codes. MIKGI synthesizes the incomplete knowledge graphs derived from these multi-source embeddings by minimizing a spherical loss function that combines the pairwise directional similarities of embeddings computed from all available sources. MIKGI outputs harmonized semantic embedding vectors for all EHR codes, which improves the quality of the embeddings and enables direct assessment of both similarity and relatedness between any pair of codes from multiple healthcare systems. With EHR co-occurrence data from Veteran Affairs (VA) healthcare and Mass General Brigham (MGB), MIKGI algorithm produces high quality embeddings for a variety of downstream tasks including detecting known similar or related entity pairs and mapping VA local codes to the relevant EHR codes used at MGB. Based on the cosine similarity of the MIKGI trained embeddings, the AUC was 0.918 for detecting similar entity pairs and 0.809 for detecting related pairs. 
For cross-institutional medical code mapping, the top 1 and top 5 accuracy were 91.0% and 97.5% when mapping medication codes at VA to RxNorm medication codes at MGB; 59.1% and 75.8% when mapping VA local laboratory codes to LOINC hierarchy. When trained with 500 labels, the lab code mapping attained top 1 and 5 accuracy at 77.7% and 87.9%. MIKGI also attained best performance in selecting VA local lab codes for desired laboratory tests and COVID-19 related features for COVID EHR studies. Compared to existing methods, MIKGI attained the most robust performance with accuracy the highest or near the highest across all tasks. The proposed MIKGI algorithm can effectively integrate incomplete summary data from biomedical text and EHR data to generate harmonized embeddings for EHR codes for knowledge graph modeling and cross-institutional translation of EHR codes. [Display omitted] •Knowledge graph completion approach to enable cross-institutional data harmonization.•Semantic representation learning for partially overlapping multi-institutional codes.•Improved representation by integrating EHR and textual data.</description><identifier>ISSN: 1532-0464</identifier><identifier>EISSN: 1532-0480</identifier><identifier>DOI: 10.1016/j.jbi.2022.104147</identifier><language>eng</language><publisher>Elsevier Inc</publisher><subject>Code mapping ; Knowledge graph ; PMI matrix ; Transfer learning ; Word embedding</subject><ispartof>Journal of biomedical informatics, 2022-09, Vol.133, p.104147-104147, Article 104147</ispartof><rights>2022 The Authors</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c373t-fffe3710a8b5e98c88dd408419a6d29e29e9c5a46ca48019b0d1197b5c4ed04c3</citedby><cites>FETCH-LOGICAL-c373t-fffe3710a8b5e98c88dd408419a6d29e29e9c5a46ca48019b0d1197b5c4ed04c3</cites><orcidid>0000-0002-4797-3200 ; 0000-0001-5238-1413 ; 0000-0002-5379-2502 ; 0000-0003-0616-0403 ; 
0000-0001-8566-9552 ; 0000-0002-5632-5723 ; 0000-0001-7237-5336</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Zhou, Doudou</creatorcontrib><creatorcontrib>Gan, Ziming</creatorcontrib><creatorcontrib>Shi, Xu</creatorcontrib><creatorcontrib>Patwari, Alina</creatorcontrib><creatorcontrib>Rush, Everett</creatorcontrib><creatorcontrib>Bonzel, Clara-Lea</creatorcontrib><creatorcontrib>Panickan, Vidul A.</creatorcontrib><creatorcontrib>Hong, Chuan</creatorcontrib><creatorcontrib>Ho, Yuk-Lam</creatorcontrib><creatorcontrib>Cai, Tianrun</creatorcontrib><creatorcontrib>Costa, Lauren</creatorcontrib><creatorcontrib>Li, Xiaoou</creatorcontrib><creatorcontrib>Castro, Victor M.</creatorcontrib><creatorcontrib>Murphy, Shawn N.</creatorcontrib><creatorcontrib>Brat, Gabriel</creatorcontrib><creatorcontrib>Weber, Griffin</creatorcontrib><creatorcontrib>Avillach, Paul</creatorcontrib><creatorcontrib>Gaziano, J. Michael</creatorcontrib><creatorcontrib>Cho, Kelly</creatorcontrib><creatorcontrib>Liao, Katherine P.</creatorcontrib><creatorcontrib>Lu, Junwei</creatorcontrib><creatorcontrib>Cai, Tianxi</creatorcontrib><title>Multiview Incomplete Knowledge Graph Integration with application to cross-institutional EHR data harmonization</title><title>Journal of biomedical informatics</title><description>The growing availability of electronic health records (EHR) data opens opportunities for integrative analysis of multi-institutional EHR to produce generalizable knowledge. A key barrier to such integrative analyses is the lack of semantic interoperability across different institutions due to coding differences. 
We propose a Multiview Incomplete Knowledge Graph Integration (MIKGI) algorithm to integrate information from multiple sources with partially overlapping EHR concept codes to enable translations between healthcare systems. The MIKGI algorithm combines knowledge graph information from (i) embeddings trained from the co-occurrence patterns of medical codes within each EHR system and (ii) semantic embeddings of the textual strings of all medical codes obtained from the Self-Aligning Pretrained BERT (SAPBERT) algorithm. Due to the heterogeneity in the coding across healthcare systems, each EHR source provides partial coverage of the available codes. MIKGI synthesizes the incomplete knowledge graphs derived from these multi-source embeddings by minimizing a spherical loss function that combines the pairwise directional similarities of embeddings computed from all available sources. MIKGI outputs harmonized semantic embedding vectors for all EHR codes, which improves the quality of the embeddings and enables direct assessment of both similarity and relatedness between any pair of codes from multiple healthcare systems. With EHR co-occurrence data from Veteran Affairs (VA) healthcare and Mass General Brigham (MGB), MIKGI algorithm produces high quality embeddings for a variety of downstream tasks including detecting known similar or related entity pairs and mapping VA local codes to the relevant EHR codes used at MGB. Based on the cosine similarity of the MIKGI trained embeddings, the AUC was 0.918 for detecting similar entity pairs and 0.809 for detecting related pairs. For cross-institutional medical code mapping, the top 1 and top 5 accuracy were 91.0% and 97.5% when mapping medication codes at VA to RxNorm medication codes at MGB; 59.1% and 75.8% when mapping VA local laboratory codes to LOINC hierarchy. When trained with 500 labels, the lab code mapping attained top 1 and 5 accuracy at 77.7% and 87.9%. 
MIKGI also attained best performance in selecting VA local lab codes for desired laboratory tests and COVID-19 related features for COVID EHR studies. Compared to existing methods, MIKGI attained the most robust performance with accuracy the highest or near the highest across all tasks. The proposed MIKGI algorithm can effectively integrate incomplete summary data from biomedical text and EHR data to generate harmonized embeddings for EHR codes for knowledge graph modeling and cross-institutional translation of EHR codes. [Display omitted] •Knowledge graph completion approach to enable cross-institutional data harmonization.•Semantic representation learning for partially overlapping multi-institutional codes.•Improved representation by integrating EHR and textual data.</description><subject>Code mapping</subject><subject>Knowledge graph</subject><subject>PMI matrix</subject><subject>Transfer learning</subject><subject>Word embedding</subject><issn>1532-0464</issn><issn>1532-0480</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNp9kF9LwzAUxYMoOKcfwLc8-tKZtOk_fJKh23AiiD6HNLndUrqmJqlDP73ZKj4KF3Jzck7g_BC6pmRGCc1um1lT6VlM4jjcGWX5CZrQNIkjwgpy-rdn7BxdONcQQmmaZhNknofW608Ne7zqpNn1LXjAT53Zt6A2gBdW9Nvw5GFjhdemw3vtt1j0favlKHiDpTXORbpzXvvhIIoWPyxfsRJe4K2wO9Pp76P7Ep3VonVw9XtO0fvjw9t8Ga1fFqv5_TqSSZ74qK5rSHJKRFGlUBayKJRipGC0FJmKSwhTylSwTIrQj5YVUZSWeZVKBoowmUzRzfhvb83HAM7znXYS2lZ0YAbH46xkjKaUxMFKR-uxhYWa91bvhP3ilPADXN7wAJcf4PIRbsjcjRkIHQI9y53U0ElQ2oL0XBn9T_oHqEWEKw</recordid><startdate>202209</startdate><enddate>202209</enddate><creator>Zhou, Doudou</creator><creator>Gan, Ziming</creator><creator>Shi, Xu</creator><creator>Patwari, Alina</creator><creator>Rush, Everett</creator><creator>Bonzel, Clara-Lea</creator><creator>Panickan, Vidul A.</creator><creator>Hong, Chuan</creator><creator>Ho, Yuk-Lam</creator><creator>Cai, Tianrun</creator><creator>Costa, Lauren</creator><creator>Li, 
Xiaoou</creator><creator>Castro, Victor M.</creator><creator>Murphy, Shawn N.</creator><creator>Brat, Gabriel</creator><creator>Weber, Griffin</creator><creator>Avillach, Paul</creator><creator>Gaziano, J. Michael</creator><creator>Cho, Kelly</creator><creator>Liao, Katherine P.</creator><creator>Lu, Junwei</creator><creator>Cai, Tianxi</creator><general>Elsevier Inc</general><scope>6I.</scope><scope>AAFTH</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-4797-3200</orcidid><orcidid>https://orcid.org/0000-0001-5238-1413</orcidid><orcidid>https://orcid.org/0000-0002-5379-2502</orcidid><orcidid>https://orcid.org/0000-0003-0616-0403</orcidid><orcidid>https://orcid.org/0000-0001-8566-9552</orcidid><orcidid>https://orcid.org/0000-0002-5632-5723</orcidid><orcidid>https://orcid.org/0000-0001-7237-5336</orcidid></search><sort><creationdate>202209</creationdate><title>Multiview Incomplete Knowledge Graph Integration with application to cross-institutional EHR data harmonization</title><author>Zhou, Doudou ; Gan, Ziming ; Shi, Xu ; Patwari, Alina ; Rush, Everett ; Bonzel, Clara-Lea ; Panickan, Vidul A. ; Hong, Chuan ; Ho, Yuk-Lam ; Cai, Tianrun ; Costa, Lauren ; Li, Xiaoou ; Castro, Victor M. ; Murphy, Shawn N. ; Brat, Gabriel ; Weber, Griffin ; Avillach, Paul ; Gaziano, J. Michael ; Cho, Kelly ; Liao, Katherine P. 
; Lu, Junwei ; Cai, Tianxi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c373t-fffe3710a8b5e98c88dd408419a6d29e29e9c5a46ca48019b0d1197b5c4ed04c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Code mapping</topic><topic>Knowledge graph</topic><topic>PMI matrix</topic><topic>Transfer learning</topic><topic>Word embedding</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhou, Doudou</creatorcontrib><creatorcontrib>Gan, Ziming</creatorcontrib><creatorcontrib>Shi, Xu</creatorcontrib><creatorcontrib>Patwari, Alina</creatorcontrib><creatorcontrib>Rush, Everett</creatorcontrib><creatorcontrib>Bonzel, Clara-Lea</creatorcontrib><creatorcontrib>Panickan, Vidul A.</creatorcontrib><creatorcontrib>Hong, Chuan</creatorcontrib><creatorcontrib>Ho, Yuk-Lam</creatorcontrib><creatorcontrib>Cai, Tianrun</creatorcontrib><creatorcontrib>Costa, Lauren</creatorcontrib><creatorcontrib>Li, Xiaoou</creatorcontrib><creatorcontrib>Castro, Victor M.</creatorcontrib><creatorcontrib>Murphy, Shawn N.</creatorcontrib><creatorcontrib>Brat, Gabriel</creatorcontrib><creatorcontrib>Weber, Griffin</creatorcontrib><creatorcontrib>Avillach, Paul</creatorcontrib><creatorcontrib>Gaziano, J. 
Michael</creatorcontrib><creatorcontrib>Cho, Kelly</creatorcontrib><creatorcontrib>Liao, Katherine P.</creatorcontrib><creatorcontrib>Lu, Junwei</creatorcontrib><creatorcontrib>Cai, Tianxi</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Journal of biomedical informatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhou, Doudou</au><au>Gan, Ziming</au><au>Shi, Xu</au><au>Patwari, Alina</au><au>Rush, Everett</au><au>Bonzel, Clara-Lea</au><au>Panickan, Vidul A.</au><au>Hong, Chuan</au><au>Ho, Yuk-Lam</au><au>Cai, Tianrun</au><au>Costa, Lauren</au><au>Li, Xiaoou</au><au>Castro, Victor M.</au><au>Murphy, Shawn N.</au><au>Brat, Gabriel</au><au>Weber, Griffin</au><au>Avillach, Paul</au><au>Gaziano, J. Michael</au><au>Cho, Kelly</au><au>Liao, Katherine P.</au><au>Lu, Junwei</au><au>Cai, Tianxi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Multiview Incomplete Knowledge Graph Integration with application to cross-institutional EHR data harmonization</atitle><jtitle>Journal of biomedical informatics</jtitle><date>2022-09</date><risdate>2022</risdate><volume>133</volume><spage>104147</spage><epage>104147</epage><pages>104147-104147</pages><artnum>104147</artnum><issn>1532-0464</issn><eissn>1532-0480</eissn><abstract>The growing availability of electronic health records (EHR) data opens opportunities for integrative analysis of multi-institutional EHR to produce generalizable knowledge. A key barrier to such integrative analyses is the lack of semantic interoperability across different institutions due to coding differences. 
We propose a Multiview Incomplete Knowledge Graph Integration (MIKGI) algorithm to integrate information from multiple sources with partially overlapping EHR concept codes to enable translations between healthcare systems. The MIKGI algorithm combines knowledge graph information from (i) embeddings trained from the co-occurrence patterns of medical codes within each EHR system and (ii) semantic embeddings of the textual strings of all medical codes obtained from the Self-Aligning Pretrained BERT (SAPBERT) algorithm. Due to the heterogeneity in the coding across healthcare systems, each EHR source provides partial coverage of the available codes. MIKGI synthesizes the incomplete knowledge graphs derived from these multi-source embeddings by minimizing a spherical loss function that combines the pairwise directional similarities of embeddings computed from all available sources. MIKGI outputs harmonized semantic embedding vectors for all EHR codes, which improves the quality of the embeddings and enables direct assessment of both similarity and relatedness between any pair of codes from multiple healthcare systems. With EHR co-occurrence data from Veteran Affairs (VA) healthcare and Mass General Brigham (MGB), MIKGI algorithm produces high quality embeddings for a variety of downstream tasks including detecting known similar or related entity pairs and mapping VA local codes to the relevant EHR codes used at MGB. Based on the cosine similarity of the MIKGI trained embeddings, the AUC was 0.918 for detecting similar entity pairs and 0.809 for detecting related pairs. For cross-institutional medical code mapping, the top 1 and top 5 accuracy were 91.0% and 97.5% when mapping medication codes at VA to RxNorm medication codes at MGB; 59.1% and 75.8% when mapping VA local laboratory codes to LOINC hierarchy. When trained with 500 labels, the lab code mapping attained top 1 and 5 accuracy at 77.7% and 87.9%. 
MIKGI also attained best performance in selecting VA local lab codes for desired laboratory tests and COVID-19 related features for COVID EHR studies. Compared to existing methods, MIKGI attained the most robust performance with accuracy the highest or near the highest across all tasks. The proposed MIKGI algorithm can effectively integrate incomplete summary data from biomedical text and EHR data to generate harmonized embeddings for EHR codes for knowledge graph modeling and cross-institutional translation of EHR codes. [Display omitted] •Knowledge graph completion approach to enable cross-institutional data harmonization.•Semantic representation learning for partially overlapping multi-institutional codes.•Improved representation by integrating EHR and textual data.</abstract><pub>Elsevier Inc</pub><doi>10.1016/j.jbi.2022.104147</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0002-4797-3200</orcidid><orcidid>https://orcid.org/0000-0001-5238-1413</orcidid><orcidid>https://orcid.org/0000-0002-5379-2502</orcidid><orcidid>https://orcid.org/0000-0003-0616-0403</orcidid><orcidid>https://orcid.org/0000-0001-8566-9552</orcidid><orcidid>https://orcid.org/0000-0002-5632-5723</orcidid><orcidid>https://orcid.org/0000-0001-7237-5336</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1532-0464
ispartof Journal of biomedical informatics, 2022-09, Vol.133, p.104147-104147, Article 104147
issn 1532-0464
1532-0480
language eng
recordid cdi_proquest_miscellaneous_2694415102
source Elsevier
subjects Code mapping
Knowledge graph
PMI matrix
Transfer learning
Word embedding
title Multiview Incomplete Knowledge Graph Integration with application to cross-institutional EHR data harmonization
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T16%3A28%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multiview%20Incomplete%20Knowledge%20Graph%20Integration%20with%20application%20to%20cross-institutional%20EHR%20data%20harmonization&rft.jtitle=Journal%20of%20biomedical%20informatics&rft.au=Zhou,%20Doudou&rft.date=2022-09&rft.volume=133&rft.spage=104147&rft.epage=104147&rft.pages=104147-104147&rft.artnum=104147&rft.issn=1532-0464&rft.eissn=1532-0480&rft_id=info:doi/10.1016/j.jbi.2022.104147&rft_dat=%3Cproquest_cross%3E2694415102%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c373t-fffe3710a8b5e98c88dd408419a6d29e29e9c5a46ca48019b0d1197b5c4ed04c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2694415102&rft_id=info:pmid/&rfr_iscdi=true