Applying Machine Learning Algorithms to Segment High-Cost Patient Populations

Bibliographic Details
Published in: Journal of general internal medicine : JGIM, 2019-02, Vol. 34 (2), p. 211-217
Main Authors: Yan, Jiali; Linn, Kristin A.; Powers, Brian W.; Zhu, Jingsan; Jain, Sachin H.; Kowalski, Jennifer L.; Navathe, Amol S.
Format: Article
Language: English
Description
Summary:
Background: Efforts to improve the value of care for high-cost patients may benefit from care management strategies targeted at clinically distinct subgroups of patients.
Objective: To evaluate the performance of three different machine learning algorithms for identifying subgroups of high-cost patients.
Design: We applied three different clustering algorithms (connectivity-based clustering using agglomerative hierarchical clustering, centroid-based clustering with the k-medoids algorithm, and density-based clustering with the OPTICS algorithm) to a clinical and administrative dataset. We then examined the extent to which each algorithm identified subgroups of patients that were (1) clinically distinct and (2) associated with meaningful differences in relevant utilization metrics.
Participants: Patients enrolled in a national Medicare Advantage plan, categorized in the top decile of spending (n = 6154).
Main Measures: Post hoc discriminative models comparing the importance of variables for distinguishing observations in one cluster from the rest. Variance in utilization and spending measures.
Key Results: Connectivity-based, centroid-based, and density-based clustering identified eight, five, and ten subgroups of high-cost patients, respectively. Post hoc discriminative models indicated that density-based clustering subgroups were the most clinically distinct. The variance of utilization and spending measures was the greatest among the subgroups identified through density-based clustering.
Conclusions: Machine learning algorithms can be used to segment a high-cost patient population into subgroups of patients that are clinically distinct and associated with meaningful differences in utilization and spending measures. For these purposes, density-based clustering with the OPTICS algorithm outperformed connectivity-based and centroid-based clustering algorithms.
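To make the centroid-based step in the summary concrete, the following is a minimal sketch of k-medoids clustering on synthetic data, followed by one possible reading of "variance in spending measures" (the variance of per-cluster mean spending). It is not the authors' implementation. The two features, the Manhattan distance, the choice of k = 3, the synthetic spending figures, and all function names are assumptions made for illustration only.

```python
import random
from statistics import mean, pvariance


def manhattan(a, b):
    """L1 distance between two equal-length feature vectors (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda c: manhattan(p, points[medoids[c]]))
        for p in points
    ]


def k_medoids(points, k, seed=0, max_iter=100):
    """Alternating k-medoids: assign points, then re-pick each cluster's medoid.

    Returns (medoids, labels); medoids are indices into points.
    """
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties out
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member with the smallest total distance
            # to every other member of the same cluster.
            best = min(
                members,
                key=lambda i: sum(manhattan(points[i], points[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


def variance_of_cluster_means(values, labels, k):
    """Population variance of the per-cluster means of one measure."""
    groups = [[v for v, lab in zip(values, labels) if lab == c] for c in range(k)]
    return pvariance([mean(g) for g in groups if g])


# Synthetic stand-in for patient data: two hypothetical features per patient
# and a hypothetical annual spending figure. None of this comes from the study.
rng = random.Random(1)
features, spending = [], []
for group, (cx, cy) in enumerate([(1.0, 0.5), (6.0, 1.0), (3.0, 5.0)]):
    for _ in range(40):
        features.append((rng.gauss(cx, 0.4), rng.gauss(cy, 0.4)))
        spending.append(rng.gauss(20000 + 15000 * group, 2000))

medoids, labels = k_medoids(features, k=3)
spread = variance_of_cluster_means(spending, labels, k=3)
print("medoid indices:", medoids)
print("variance of cluster mean spending:", round(spread, 1))
```

Under this reading, a larger variance of cluster means suggests that the subgroups differ more in spending, which is the kind of comparison the summary describes across the three algorithms. The study's exact variance measure is not specified in this record.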
ISSN: 0884-8734, 1525-1497
DOI: 10.1007/s11606-018-4760-8