Integration of single‐cell proteomic datasets through distinctive proteins in cell clusters
Published in: | Proteomics (Weinheim) 2024-04, Vol.24 (7), p.e2300282-n/a |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Summary: | The use of mass spectrometry and antibody‐based sequencing technologies at the single‐cell level has led to an increase in single‐cell proteomic datasets. Integrating these datasets is crucial to eliminate the batch effect that often arises due to their limited sequencing molecules. Although methods for horizontally integrating high‐dimensional single‐cell transcriptomic datasets can also be applied to single‐cell proteomic datasets, a specialized approach explicitly tailored for low‐dimensional proteomic datasets may enhance the integration process. Here, we introduce SCPRO‐HI, an algorithm for the horizontal integration of antibody‐based single‐cell proteomic datasets. It utilizes a hierarchical cell anchoring technique to match cells based on the similarity of distinctive proteins for constituting cell clusters. A novel variational auto‐encoder model is employed for correcting batch effects on the protein abundances, eliminating the need for mapping them into a new domain. Moreover, we propose a technique for extending the algorithm to high‐dimensional datasets. The performance of the SCPRO‐HI algorithm is evaluated using simulated and real‐world single‐cell proteomic datasets. The findings demonstrate our algorithm outperforms state‐of‐the‐art methods, achieving a 75% higher silhouette score while preserving HVPs 13% better. Furthermore, the algorithm shows competitive performance in transcriptomic datasets, suggesting potential for integrating high‐dimensional mass‐spectrometry‐based proteomic datasets. |
---|---|
ISSN: | 1615-9853 1615-9861 |
DOI: | 10.1002/pmic.202300282 |
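
The summary reports integration quality as a silhouette score. As a rough illustration of what that metric measures, the sketch below computes the mean silhouette coefficient in plain Python and compares a toy dataset before and after a batch shift is removed. The summary does not say which labels the score is computed over, so cell-type labels are assumed here; the function name, the coordinates, and the labels are all invented for illustration and are not taken from SCPRO-HI.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:  # identical points everywhere would give 0/0
            total += (b - a) / denom
    return total / len(labels)


# Hypothetical toy data: two assumed cell types ("T", "B"), two batches each.
cell_types = ["T", "T", "T", "T", "B", "B", "B", "B"]
# Before integration: the second batch of each type is shifted by 6 on the x axis.
before = [(0, 0), (1, 0), (6, 0), (7, 0), (3, 0), (4, 0), (9, 0), (10, 0)]
# After an idealised integration: cells of the same type sit together.
after = [(0, 0), (1, 0), (0, 1), (1, 1), (3, 0), (4, 0), (3, 1), (4, 1)]

print("before:", silhouette_score(before, cell_types))
print("after: ", silhouette_score(after, cell_types))
```

With labels of this kind, a higher score after integration means that cells of the same type lie closer to each other than to cells of other types, which is the sense in which a higher silhouette score indicates better batch correction.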