Single-cell assignment using multiple-adversarial domain adaptation network with large-scale references

Bibliographic Details
Published in: Cell reports methods 2023-09, Vol.3 (9), p.100577, Article 100577
Main Authors: Ren, Pengfei, Shi, Xiaoying, Yu, Zhiguang, Dong, Xin, Ding, Xuanxin, Wang, Jin, Sun, Liangdong, Yan, Yilv, Hu, Junjie, Zhang, Peng, Chen, Qianming, Zhang, Jing, Li, Taiwen, Wang, Chenfei
Format: Article
Language: English
Description
Summary: The rapid accumulation of single-cell RNA-seq data has provided rich resources to characterize various human cell populations. However, achieving accurate cell-type annotation using public references presents challenges due to inconsistent annotations, batch effects, and rare cell types. Here, we introduce SELINA (single-cell identity navigator), an integrative and automatic cell-type annotation framework based on a pre-curated reference atlas spanning various tissues. SELINA employs a multiple-adversarial domain adaptation network to remove batch effects within the reference dataset. Additionally, it enhances the annotation of less frequent cell types by synthetic minority oversampling and fits query data with the reference data using an autoencoder. SELINA culminates in the creation of a comprehensive and uniform reference atlas, encompassing 1.7 million cells covering 230 distinct human cell types. We substantiate its robustness and superiority across a multitude of human tissues. Notably, SELINA could accurately annotate cells within diverse disease contexts. SELINA provides a complete solution for human single-cell RNA-seq data annotation with both Python and R packages.

•SELINA combines SMOTE, MADA, and an autoencoder to improve annotation accuracy
•SELINA pre-builds a reference atlas with 1.7 million cells covering 230 human cell types
•SELINA annotates cell types with high accuracy in various disease scenarios

Cell-type annotation is a crucial step for interpreting cell-type functions in scRNA-seq data processing. There are two main methods for cell-type annotation: marker based and reference based. Reference-based methods transfer cell-type labels from reference datasets to query datasets using machine learning techniques, resulting in improved accuracy and broader applications. However, challenges remain, including difficulty in leveraging large-scale public data, cell number imbalances, batch effects, and reliance on reference data quality. Addressing these challenges is essential to improve the accuracy of cell-type annotation and enable the full potential of scRNA-seq data.

Ren et al. develop SELINA (single-cell identity navigator), an integrative and automatic cell-type annotation framework based on a multiple-adversarial domain adaptation network and a pre-curated reference atlas of various tissues. SELINA enables accurate and robust cell-type annotation.
ISSN: 2667-2375
DOI: 10.1016/j.crmeth.2023.100577