
Using Next Generation Sequencing of Flow Cytometry CD Markers and Machine Learning As a Replacement to Flow Cytometry Analysis for the Diagnosis of Hematologic Neoplasms

Bibliographic Details
Published in: Blood 2023-11, Vol.142 (Supplement 1), p.3654-3654
Main Authors: Albitar, Maher, Zhang, Hong, Ip, Andrew, Ma, Wanlong, McCloskey, James, Linder, Katherine, Estella, Jeffrey Justin, Koprivnikar, Jamie, Biran, Noa, Siegel, David S., Charifa, Ahmad, Mohtashamian, Arash, Pecora, Andrew L, Goy, Andre
Format: Article
Language: English
Description
Summary: Introduction: Flow cytometry performs multi-parameter analysis of cells and analyzes surface and intracellular markers for accurate phenotypic characterization of a cell population. Flow cytometry is used extensively in the diagnosis and classification of various hematologic neoplasms. However, analysis of the generated data is time-consuming and remains subjective, requiring special skill and experience. Furthermore, some diagnostic classes, such as myeloproliferative neoplasms (MPN) and myelodysplastic syndrome (MDS), are difficult to diagnose using flow cytometry. The RNA levels of the CD markers used in flow cytometry can be reliably quantified using next-generation sequencing (NGS). When all cells are sequenced together, however, the ability to study subpopulations of cells is lost, which hinders accurate diagnosis. Machine learning algorithms are capable of multi-marker normalization and may compensate for the loss of subclonal analysis. To test this assumption, we explored the potential of using the RNA levels of 30 CD markers along with a machine learning algorithm in the differential diagnosis between various types of hematologic neoplasms.
Methods: RNA was extracted from fresh bone marrow and peripheral blood samples from 172 acute myeloid leukemia (AML), 369 normal control, 68 MPN, 218 MDS, 93 acute lymphoblastic leukemia (ALL), 74 chronic lymphocytic leukemia (CLL), 38 mantle cell lymphoma, and 83 multiple myeloma cases. The samples were consecutive and collected without selection. RNA sequencing was performed using a targeted hybrid capture panel that included the CD1A, CD2, CD3D, CD3E, CD3G, CD4, CD5, CD7, CD8A, CD8B, CD10, CD14, CD19, CD20, CD22, CD33, CD34, CD38, CD40, CD44, CD47, CD68, CD70, CD74, CD79A, CD79B, CD81, CD138, CD200, and CD274 genes. Salmon v1.4.0 software was used for expression quantification (TPM). A machine learning algorithm (random forest) was used to classify diseases. Two thirds of the samples were used for training the random forest algorithm and one third was used for testing.
Results: Although a diagnosis can frequently be made by simply inspecting the RNA levels of various CD markers, machine learning is needed when the fraction of neoplastic cells is low. Using machine learning (random forest), diagnosis of most hematologic neoplasms was achieved with high sensitivity and specificity in the testing set. The area under the curve (AUC) was 0.972 (95% CI: 0.950-0.994) for AML vs. normal, 0.936 (95% CI: 0.898-0.974) for normal vs
ISSN:0006-4971
1528-0020
DOI:10.1182/blood-2023-187041
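
Illustrative note: the evaluation protocol described in the Methods (two thirds of samples for training, one third for testing, with AUC reported per pairwise comparison) can be sketched as below. This is a minimal sketch and not the authors' code. The abstract does not say whether the split was stratified by diagnosis, so the stratification here is an assumption, as are the function names `stratified_split` and `auc` and the seed. The random forest itself is not reproduced; `auc` only shows how a set of classifier scores would be summarized.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=1 / 3, seed=0):
    """Return (train_idx, test_idx), holding out ~test_fraction of each class.

    Stratifying by class is an assumption; the abstract only states that
    two thirds of samples were used for training and one third for testing.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)

    train, test = [], []
    for lab in sorted(by_class):
        idx = by_class[lab][:]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def auc(scores, is_positive):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Cohort sizes for the AML vs. normal comparison, taken from the abstract.
labels = ["AML"] * 172 + ["normal"] * 369
train_idx, test_idx = stratified_split(labels)
```

In use, a classifier would be fitted on the samples at `train_idx`, and its predicted probabilities for the samples at `test_idx` would be passed to `auc` together with a flag marking which test samples are AML.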