Integration of single‐cell proteomic datasets through distinctive proteins in cell clusters

Bibliographic Details
Published in: Proteomics (Weinheim) 2024-04, Vol.24 (7), p.e2300282-n/a
Main Authors: Koca, Mehmet Burak; Sevilgen, Fatih Erdoğan
Format: Article
Language: English
Description
Summary: The use of mass spectrometry and antibody‐based sequencing technologies at the single‐cell level has led to an increase in single‐cell proteomic datasets. Integrating these datasets is crucial to eliminate the batch effect that often arises due to their limited sequencing molecules. Although methods for horizontally integrating high‐dimensional single‐cell transcriptomic datasets can also be applied to single‐cell proteomic datasets, a specialized approach explicitly tailored to low‐dimensional proteomic datasets may enhance the integration process. Here, we introduce SCPRO‐HI, an algorithm for the horizontal integration of antibody‐based single‐cell proteomic datasets. It uses a hierarchical cell anchoring technique that matches cells based on the similarity of the distinctive proteins that constitute cell clusters. A novel variational auto‐encoder model corrects batch effects directly on the protein abundances, eliminating the need to map them into a new domain. Moreover, we propose a technique for extending the algorithm to high‐dimensional datasets. The performance of the SCPRO‐HI algorithm is evaluated using simulated and real‐world single‐cell proteomic datasets. The findings demonstrate that our algorithm outperforms state‐of‐the‐art methods, achieving a 75% higher silhouette score while preserving HVPs 13% better. Furthermore, the algorithm shows competitive performance on transcriptomic datasets, suggesting potential for integrating high‐dimensional mass‐spectrometry‐based proteomic datasets.
ISSN: 1615-9853
1615-9861
DOI: 10.1002/pmic.202300282
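
The summary reports integration quality as a silhouette score. As a rough illustration of that metric only (not of SCPRO‐HI itself), the sketch below computes a mean silhouette coefficient over cell-type labels for a toy dataset before and after a batch shift is removed. The function name `silhouette_score`, the toy coordinates, and the "corrected" values are all hypothetical and are not taken from the article.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Average distance from point i to the other members of a cluster.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster contributes 0 by convention
        a = mean_dist(i, own)  # cohesion: distance within the own cluster
        b = min(mean_dist(i, m) for other, m in clusters.items() if other != lab)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(labels)

# Toy data (hypothetical): two cell types measured in two batches.
# In `raw`, the second batch is shifted by +5 along the first axis.
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
raw = [(0, 0), (0, 1), (5, 0), (5, 1), (0, 4), (0, 5), (5, 4), (5, 5)]
# `corrected` assumes the batch shift has been removed by some integration step.
corrected = [(0, 0), (0, 1), (0, 0), (0, 1), (0, 4), (0, 5), (0, 4), (0, 5)]

raw_score = silhouette_score(raw, labels)
corrected_score = silhouette_score(corrected, labels)
print(f"raw: {raw_score:.3f}  corrected: {corrected_score:.3f}")
```

In this toy setting the cell-type silhouette rises once the batch shift is gone, which is the sense in which a higher silhouette score indicates better integration.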