Loading…

A Bibliographic Dataset of Health Artificial Intelligence Research

The aim of this study is to construct a curated bibliographic dataset for a landscape analysis on Health Artificial Intelligence (HAI) research. We integrated HAI-related bibliographic records, including publications, open research datasets, patents, research grants, and clinical trials from Medline...

Full description

Saved in:
Bibliographic Details
Published in:Health data science 2024, Vol.4, p.0125-0125
Main Authors: Shi, Xuanyu, Yin, Daoxin, Bai, Yongmei, Zhao, Wenjing, Guo, Xin, Sun, Huage, Cui, Dongliang, Du, Jian
Format: Article
Language:English
Citations: Items that this one cites
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The aim of this study is to construct a curated bibliographic dataset for a landscape analysis on Health Artificial Intelligence (HAI) research. We integrated HAI-related bibliographic records, including publications, open research datasets, patents, research grants, and clinical trials from Medline and Dimensions. Searching: Relevant documents were identified using Medical Subject Headings (MeSH) and Field of Research (FoR) indexed by 2 bibliographic databases, Medline and Dimensions. Extracting: MeSH terms annotated from the aforementioned bibliographic databases served as the primary information for our processing. For document records lacking MeSH terms, we re-extracted them using the Medical Text Indexer (MTI). Mapping: In order to enhance interoperability, HAI multi-documents were organized using a mapping system incorporating MeSH, FoR, The International Classification of Diseases (ICD-10), and Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT). Integrating: All documents were curated based on a pre-defined ontology of health problems and AI technologies from the MeSH hierarchy. We collected 96,332 HAI documents (publications: 75,820, open research datasets: 638, patents: 11,226, grants: 6,113, and clinical trials: 2,535) during 2009 to 2021. On average, 75.12% of the documents were tagged with at least one label related to either health problems or AI technologies (with 92.9% of publications tagged). This study presents a comprehensive pipeline for processing and curating HAI bibliographic documents following the FAIR (Findable, Accessible, Interoperable, Reusable) standard, offering a valuable multidimensional collection for the community. This dataset serves as a crucial resource for horizontally scanning the funding, research, clinical assessments, and innovations within the HAI field.
ISSN:2765-8783
2765-8783
DOI:10.34133/hds.0125