Loading…

Automatic identification of relevant genes from low-dimensional embeddings of single-cell RNA-seq data

Abstract Motivation Dimensionality reduction is a key step in the analysis of single-cell RNA-sequencing data. It produces a low-dimensional embedding for visualization and as a calculation base for downstream analysis. Nonlinear techniques are most suitable to handle the intrinsic complexity of lar...

Full description

Saved in:
Bibliographic Details
Published in:Bioinformatics 2020-08, Vol.36 (15), p.4291-4295
Main Authors: Angerer, Philipp, Fischer, David S, Theis, Fabian J, Scialdone, Antonio, Marr, Carsten
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Abstract Motivation Dimensionality reduction is a key step in the analysis of single-cell RNA-sequencing data. It produces a low-dimensional embedding for visualization and as a calculation base for downstream analysis. Nonlinear techniques are most suitable to handle the intrinsic complexity of large, heterogeneous single-cell data. However, with no linear relation between gene and embedding coordinate, there is no way to extract the identity of genes driving any cell’s position in the low-dimensional embedding, making it difficult to characterize the underlying biological processes. Results In this article, we introduce the concepts of local and global gene relevance to compute an equivalent of principal component analysis loadings for non-linear low-dimensional embeddings. Global gene relevance identifies drivers of the overall embedding, while local gene relevance identifies those of a defined sub-region. We apply our method to single-cell RNA-seq datasets from different experimental protocols and to different low-dimensional embedding techniques. This shows our method’s versatility to identify key genes for a variety of biological processes. Availability and implementation To ensure reproducibility and ease of use, our method is released as part of destiny 3.0, a popular R package for building diffusion maps from single-cell transcriptomic data. It is readily available through Bioconductor. Supplementary information Supplementary data are available at Bioinformatics online.
ISSN:1367-4803
1460-2059
1367-4811
DOI:10.1093/bioinformatics/btaa198