RNASeq_similarity_matrix: visually identify sample mix-ups in RNASeq data using a ‘genomic’ sequence similarity matrix


Bibliographic Details
Published in: Bioinformatics 2020-03, Vol.36 (6), p.1940-1941
Main Authors: Kist, Nicolaas C; Power, Robert A; Skelton, Andrew; Seegobin, Seth D; Verbelen, Moira; Bonde, Bushan; Malki, Karim
Format: Article
Language: English
Description
Abstract

Summary: Mistakes in linking a patient’s biological samples with their phenotype data can confound RNA-Seq studies. The current method for avoiding such sample mix-ups is to test for inconsistencies between biological data and known phenotype data such as sex. In DNA studies, however, a common QC step is to check for unexpected relatedness between samples. Here, we extend this method to RNA-Seq, which allows the detection of duplicated samples without relying on identifying inconsistencies with phenotype data.

Results: We present RNASeq_similarity_matrix, an automated tool that generates a sequence similarity matrix from RNA-Seq data, which can be used to visually identify sample mix-ups. This is particularly useful when a study contains multiple samples from the same individual, but it can also detect contamination in studies with only one sample per individual.

Availability and implementation: RNASeq_similarity_matrix has been made available as a documented, GPL-licensed Docker image at www.github.com/nicokist/RNASeq_similarity_matrix.
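The idea of a sequence similarity matrix can be illustrated with a minimal sketch. This is not the tool's implementation, which this record does not describe; it assumes that genotype calls (0, 1 or 2 alternate alleles, or `None` where coverage was too low) have already been obtained for each sample at a shared set of variant sites. The function names, the toy data, and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations


def similarity_matrix(genotypes):
    """Return (sample_names, matrix) of pairwise genotype concordance.

    genotypes maps a sample name to a list of calls at the same variant
    sites: 0, 1 or 2 alternate alleles, or None where no call was made.
    """
    samples = sorted(genotypes)
    matrix = {s: {} for s in samples}
    for s in samples:
        matrix[s][s] = 1.0
    for s, t in combinations(samples, 2):
        # Compare only the sites that were called in both samples.
        shared = [(a, b) for a, b in zip(genotypes[s], genotypes[t])
                  if a is not None and b is not None]
        if shared:
            score = sum(a == b for a, b in shared) / len(shared)
        else:
            score = float("nan")  # no overlap, so similarity is undefined
        matrix[s][t] = matrix[t][s] = score
    return samples, matrix


def flag_pairs(samples, matrix, individual_of, threshold=0.9):
    """List pairs whose similarity disagrees with their recorded labels."""
    flagged = []
    for s, t in combinations(samples, 2):
        similar = matrix[s][t] >= threshold  # NaN compares as not similar
        same_label = individual_of[s] == individual_of[t]
        if similar and not same_label:
            flagged.append((s, t, "unexpected match"))
        elif same_label and not similar:
            flagged.append((s, t, "unexpected mismatch"))
    return flagged


# Hypothetical toy data: "P2_skin" is labelled as patient P2 but carries
# the genotypes of patient P1, as it would after a sample mix-up.
genotypes = {
    "P1_blood": [0, 1, 2, 1, 0, 2, 1, 0],
    "P1_skin": [0, 1, 2, 1, 0, 2, 1, 0],
    "P2_blood": [2, 0, 1, 0, 2, 1, 0, 1],
    "P2_skin": [0, 1, 2, 1, 0, 2, None, 0],
}
individual_of = {
    "P1_blood": "P1", "P1_skin": "P1",
    "P2_blood": "P2", "P2_skin": "P2",
}

samples, matrix = similarity_matrix(genotypes)
flagged = flag_pairs(samples, matrix, individual_of)

for s in samples:
    print(s, [round(matrix[s][t], 2) for t in samples])
for pair in flagged:
    print(pair)
```

In this toy example the mislabelled sample matches both P1 samples and fails to match the other P2 sample, so three pairs are flagged without any use of phenotype data such as sex. A real analysis would plot the matrix as a heat map and would need a threshold chosen for the data at hand.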
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI: 10.1093/bioinformatics/btz821