
ABEILLE: a novel method for ABerrant Expression Identification empLoying machine LEarning from RNA-sequencing data

Bibliographic Details
Published in: Bioinformatics (Oxford, England), 2022-10, Vol.38 (20), p.4754-4761
Main Authors: Labory, Justine; Le Bideau, Gwendal; Pratella, David; Yao, Jean-Elisée; Ait-El-Mkadem Saadi, Samira; Bannwarth, Sylvie; El-Hami, Loubna; Paquis-Fluckinger, Véronique; Bottini, Silvia
Format: Article
Language: English
Description
Summary:
MOTIVATION: Current advances in omics technologies are paving the way for the diagnosis of rare diseases by offering a complementary assay to identify the responsible gene. The use of transcriptomic data to identify aberrant gene expression (AGE) has been shown to reveal potentially pathogenic events. However, popular approaches for AGE identification are limited by their reliance on statistical tests, which require choosing an arbitrary cut-off for significance assessment and the availability of several replicates, which is not always possible in clinical contexts.
RESULTS: Hence, we developed ABerrant Expression Identification empLoying machine LEarning from sequencing data (ABEILLE), a variational autoencoder (VAE)-based method for identifying AGEs from RNA-seq data without the need for replicates or a control group. ABEILLE combines a VAE, which can model data without specific assumptions about their distribution, with a decision tree that classifies genes as AGE or non-AGE. An anomaly score is associated with each gene in order to stratify AGEs by the severity of the aberration. We tested ABEILLE on a semi-synthetic and an experimental dataset, demonstrating the importance of the flexibility of the VAE configuration for identifying potential pathogenic candidates.
AVAILABILITY AND IMPLEMENTATION: ABEILLE source code is freely available at https://github.com/UCA-MSI/ABEILLE.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
ISSN: 1367-4803, 1367-4811
DOI:10.1093/bioinformatics/btac603
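
The summary describes scoring genes by how poorly a learned model reconstructs their expression, then classifying them as AGE or non-AGE. The sketch below illustrates only that general reconstruction-error idea and is not the ABEILLE implementation: a truncated SVD stands in for the VAE, a fixed threshold stands in for the decision tree, and the function names (`anomaly_scores`, `flag_aberrant`), the parameters, and the synthetic data are all hypothetical.

```python
import numpy as np


def anomaly_scores(expr, rank=1):
    """Score each (gene, sample) entry by its scaled reconstruction error.

    expr: 2-D array, genes in rows and samples in columns (log-scale values).
    rank: number of components kept by the stand-in reconstruction model.
    """
    expr = np.asarray(expr, dtype=float)
    gene_means = expr.mean(axis=1, keepdims=True)
    centered = expr - gene_means
    # ASSUMPTION: a truncated SVD replaces the VAE as the reconstruction model.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    recon = (u[:, :rank] * s[:rank]) @ vt[:rank] + gene_means
    residual = expr - recon
    # Scale residuals per gene with the median absolute deviation (MAD),
    # so that a single extreme value does not inflate its own scale.
    med = np.median(residual, axis=1, keepdims=True)
    mad = np.median(np.abs(residual - med), axis=1, keepdims=True)
    return np.abs(residual - med) / (1.4826 * mad + 1e-8)


def flag_aberrant(scores, threshold):
    """Return (gene, sample) index pairs whose score exceeds the threshold.

    ASSUMPTION: a fixed cut-off replaces the decision tree classifier.
    """
    genes, samples = np.nonzero(scores > threshold)
    return [(int(g), int(s)) for g, s in zip(genes, samples)]


# Synthetic demo: 200 genes x 20 samples sharing one latent factor,
# with a single aberrant value injected at gene 5, sample 3.
rng = np.random.default_rng(0)
loadings = rng.normal(size=(200, 1))
factors = rng.normal(size=(1, 20))
expr = 8.0 + loadings @ factors + rng.normal(scale=0.1, size=(200, 20))
expr[5, 3] += 6.0

scores = anomaly_scores(expr, rank=1)
flagged = flag_aberrant(scores, threshold=20.0)
```

The threshold of 20 is arbitrary and chosen only for this toy data. Avoiding that kind of hand-picked cut-off is one of the motivations the summary gives for the method.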