
Abstract 3677: Scalable assembly of individual patient profiles for clinical trials accrual and research

Bibliographic Details
Published in: Cancer research (Chicago, Ill.), 2018-07, Vol.78 (13_Supplement), p.3677-3677
Main Authors: Duren, Ryan, Smith, Ryan, Tackes, Nick, Neeley, Shane, Welsh, James, Li, Xuan Shirley
Format: Article
Language: English
Description
Summary: As clinical data are digitized in electronic medical records (EMR), the amount of historical data becomes a challenge for utilization and outcomes studies. An automated approach for structuring clinical data into machine-readable format is essential due to scale. We previously developed MMPower, an ElasticSearch-based technology platform for named entity extraction defined by SNOMED conditions, >2,000 Entrez Genes, >900,000 cancer alterations, and ~7,000 FDA-approved and experimental therapeutics/progression that enables clinical trials search. In a pilot study, MMPower was applied to characterize and index oncology EMR for elucidating clinical histories and identifying clinical trial participants. 52,509 individual EMR were collated by Medical Record Number (MRN) across facets including Demographics, Providers/Sites, Diagnosis, Medication, Imaging, Lab results, Pathology, Performance status, and Encounter with Physician Notes. Applying MMPower yielded a total cohort of 43,987 EMR (83.8% of EMR received) with searchable, structured metadata profiles. We found that 39,317 (89.4%) of the total cohort had a cancer diagnosis as defined by SNOMED cancer types. We also extracted an Entrez gene from 36,152 (82.2%) EMR, while cancer alterations or stratification biomarkers such as ER/PR status, ALK fusions, IGH-BCL1, EGFR exon 19 deletion, CDKN2A loss, etc., were identified in 25,208 (57.3%) EMR. Furthermore, we found that the majority of EMR (69.7%) with identifiable genes also contained a cancer alteration. Interestingly, although most EMR (39,615 or 90% of total cohort) held identifiable cancer therapeutics, only 1.7% of the total cohort were flagged as cases of progression. We then determined the feasibility of matching EMR to clinical trials and found that, strikingly, 16.7% of the total cohort were potentially eligible for institution-specific clinical trials.
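The percentages quoted above can be re-derived from the reported counts. The short Python sketch below does that arithmetic; the variable names and the `pct` helper are illustrative and not part of MMPower. The 69.7% figure is reproduced only under the assumption that every EMR with a cancer alteration also had an identifiable gene, which the abstract implies but does not state.

```python
# Counts as reported in the summary; names are illustrative only.
received = 52_509          # EMR collated by MRN
cohort = 43_987            # EMR with structured, searchable profiles
with_cancer_dx = 39_317    # SNOMED-defined cancer diagnosis
with_gene = 36_152         # at least one Entrez gene extracted
with_alteration = 25_208   # cancer alteration or stratification biomarker
with_therapeutic = 39_615  # identifiable cancer therapeutic


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


shares = {
    "structured / received": pct(cohort, received),
    "cancer diagnosis / cohort": pct(with_cancer_dx, cohort),
    "gene / cohort": pct(with_gene, cohort),
    "alteration / cohort": pct(with_alteration, cohort),
    # Assumption: all alteration-bearing EMR are a subset of gene-bearing EMR.
    "alteration / gene": pct(with_alteration, with_gene),
    "therapeutic / cohort": pct(with_therapeutic, cohort),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```

Under these assumptions the computed shares agree with the abstract's figures to the precision it reports.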
Finally, we sought to enrich individual EMR with molecular alterations data from external lab reports, so we developed a PDF to JSON transform that enables mapping of report results by MRN to our cancer alterations model, yielding a single searchable record from multiple data sources. These results suggest that the ability to structure machine-readable clinical and molecular data for individuals from EMR at scale would accelerate translational research by promoting more efficient clinical trials accrual and could be extended to prevalence calculations, evidence-backed treatment guidance through molecul
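The abstract does not describe the PDF to JSON transform itself, so the following Python sketch covers only the last step it mentions: attaching already-parsed external report results to an EMR profile that shares the same MRN. The record layout (`mrn`, `alterations`), the function name `merge_reports_by_mrn`, and the sample values are all hypothetical and chosen for illustration; they are not MMPower's schema.

```python
def merge_reports_by_mrn(profiles, reports):
    """Attach alterations from external reports to EMR profiles with the same MRN.

    profiles: dict mapping MRN -> profile dict (hypothetical layout).
    reports:  list of dicts, each with an "mrn" key and an "alterations" list.
    Returns (merged_profiles, unmatched_reports) without mutating the inputs.
    """
    # Copy each profile and its alteration list so the inputs stay untouched.
    merged = {
        mrn: {**profile, "alterations": list(profile.get("alterations", []))}
        for mrn, profile in profiles.items()
    }
    unmatched = []
    for report in reports:
        mrn = report["mrn"]
        if mrn not in merged:
            # No EMR with this MRN; keep the report aside for review.
            unmatched.append(report)
            continue
        target = merged[mrn]["alterations"]
        for alteration in report.get("alterations", []):
            if alteration not in target:  # avoid duplicate entries
                target.append(alteration)
    return merged, unmatched


# Hypothetical sample data, for illustration only.
profiles = {
    "MRN001": {"diagnosis": "lung adenocarcinoma",
               "alterations": ["EGFR exon 19 deletion"]},
}
reports = [
    {"mrn": "MRN001", "alterations": ["ALK fusion", "EGFR exon 19 deletion"]},
    {"mrn": "MRN999", "alterations": ["CDKN2A loss"]},
]

merged, unmatched = merge_reports_by_mrn(profiles, reports)
print(merged["MRN001"]["alterations"])
print(len(unmatched))
```

Keying on MRN mirrors how the abstract says the EMR facets were collated; reports whose MRN has no matching profile are returned separately rather than dropped.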
ISSN:0008-5472
1538-7445
DOI:10.1158/1538-7445.AM2018-3677