Facilitating phenotyping from clinical texts: the medkit library

Bibliographic Details
Published in: Bioinformatics (Oxford, England), 2024-11, Vol.40 (12)
Main Authors: Neuraz, Antoine; Vaillant, Ghislain; Arias, Camila; Birot, Olivier; Huynh, Kim-Tam; Fabacher, Thibaut; Rogier, Alice; Garcelon, Nicolas; Lerner, Ivan; Rance, Bastien; Coulet, Adrien
Format: Article
Language:English
Description
Summary:Phenotyping consists in applying algorithms to identify individuals associated with a specific, potentially complex, trait or condition, typically from a collection of Electronic Health Records (EHRs). Because much of the clinical information in EHRs lies in text, phenotyping from text plays an important role in studies that rely on the secondary use of EHRs. However, the heterogeneity and highly specialized nature of both the content and form of clinical texts make this task particularly tedious, and are a source of time and cost constraints in observational studies. To facilitate the development, evaluation and reproducibility of phenotyping pipelines, we developed an open-source Python library named medkit. It enables composing data processing pipelines made of easy-to-reuse software bricks, named medkit operations. In addition to the core of the library, we share the operations and pipelines we have already developed and invite the phenotyping community to reuse and enrich them. medkit is available at https://github.com/medkit-lib/medkit.
ISSN:1367-4811
1367-4803
DOI:10.1093/bioinformatics/btae681