Applying Machine Learning Algorithms to Segment High-Cost Patient Populations
Published in: | Journal of general internal medicine : JGIM 2019-02, Vol.34 (2), p.211-217 |
---|---|
Main Authors: | , , , , , , |
Format: | Article |
Language: | English |
Summary: | Background
Efforts to improve the value of care for high-cost patients may benefit from care management strategies targeted at clinically distinct subgroups of patients.
Objective
To evaluate the performance of three different machine learning algorithms for identifying subgroups of high-cost patients.
Design
We applied three different clustering algorithms—connectivity-based clustering using agglomerative hierarchical clustering, centroid-based clustering with the k-medoids algorithm, and density-based clustering with the OPTICS algorithm—to a clinical and administrative dataset. We then examined the extent to which each algorithm identified subgroups of patients that were (1) clinically distinct and (2) associated with meaningful differences in relevant utilization metrics.
Participants
Patients enrolled in a national Medicare Advantage plan, categorized in the top decile of spending (n = 6154).
Main Measures
Post hoc discriminative models comparing the importance of variables for distinguishing observations in one cluster from the rest. Variance in utilization and spending measures.
Key Results
Connectivity-based, centroid-based, and density-based clustering identified eight, five, and ten subgroups of high-cost patients, respectively. Post hoc discriminative models indicated that density-based clustering subgroups were the most clinically distinct. The variance of utilization and spending measures was the greatest among the subgroups identified through density-based clustering.
Conclusions
Machine learning algorithms can be used to segment a high-cost patient population into subgroups of patients that are clinically distinct and associated with meaningful differences in utilization and spending measures. For these purposes, density-based clustering with the OPTICS algorithm outperformed connectivity-based and centroid-based clustering algorithms. |
---|---|
ISSN: | 0884-8734 1525-1497 |
DOI: | 10.1007/s11606-018-4760-8 |
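
The Design and Main Measures entries of the summary can be illustrated with a small sketch. This record does not give the study's features, distance metric, or software, so everything below is an assumption made for illustration: the toy patient features, the spending values, the Manhattan distance, the farthest-first initialization, and the helper names (`k_medoids`, `farthest_first`, `assign`) are all hypothetical. The sketch covers only the centroid-based (k-medoids) approach, using a simple alternating update rather than any particular published variant, and then computes the variance of mean spending across the resulting subgroups.

```python
from statistics import mean, pvariance

def manhattan(a, b):
    """Manhattan distance between two feature tuples (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def farthest_first(points, k, dist):
    """Deterministic initialization: start at index 0, then repeatedly add
    the point farthest from all medoids chosen so far."""
    chosen = [0]
    while len(chosen) < k:
        nxt = max(range(len(points)),
                  key=lambda i: min(dist(points[i], points[m]) for m in chosen))
        chosen.append(nxt)
    return chosen

def assign(points, medoids, dist):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist(p, points[medoids[c]]))
            for p in points]

def k_medoids(points, k, dist=manhattan, max_iter=100):
    """Alternating k-medoids: assign points, then move each medoid to the
    member with the smallest total distance to the rest of its cluster."""
    medoids = farthest_first(points, k, dist)
    for _ in range(max_iter):
        labels = assign(points, medoids, dist)
        updated = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster empties out
                updated.append(medoids[c])
                continue
            updated.append(min(
                members,
                key=lambda i: sum(dist(points[i], points[j]) for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return assign(points, medoids, dist), medoids

# Hypothetical toy data, NOT from the study:
# (chronic conditions, inpatient admissions, emergency department visits)
patients = [
    (1, 0, 1), (2, 0, 0), (1, 1, 1),
    (8, 4, 2), (9, 5, 2), (8, 5, 3),
    (3, 1, 9), (2, 0, 10), (3, 0, 8),
]
# Hypothetical annual spending per patient, in thousands of dollars.
spending = [20, 22, 21, 80, 95, 89, 40, 45, 41]

labels, medoids = k_medoids(patients, k=3)

# Mean spending per subgroup, and the spread of those means across subgroups.
cluster_means = {
    c: mean(s for s, lab in zip(spending, labels) if lab == c)
    for c in sorted(set(labels))
}
between_cluster_variance = pvariance(list(cluster_means.values()))

print(labels, medoids, cluster_means, between_cluster_variance)
```

A larger variance of subgroup means suggests the clustering separates patients with more different spending profiles, which is the kind of comparison the Main Measures entry describes. On real data, a library implementation and a principled choice of distance metric and number of clusters would be preferable to this hand-rolled version.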