RNASeq_similarity_matrix: visually identify sample mix-ups in RNASeq data using a ‘genomic’ sequence similarity matrix
Published in: | Bioinformatics 2020-03, Vol.36 (6), p.1940-1941 |
---|---|
Main Authors: | , , , , , , |
Format: | Article |
Language: | English |
Summary: | Abstract
Summary
Mistakes in linking a patient’s biological samples with their phenotype data can confound RNA-Seq studies. The current method for avoiding such sample mix-ups is to test for inconsistencies between biological data and known phenotype data such as sex. However, in DNA studies a common QC step is to check for unexpected relatedness between samples. Here, we extend this method to RNA-Seq, which allows the detection of duplicated samples without relying on identifying inconsistencies with phenotype data.
Results
We present RNASeq_similarity_matrix: an automated tool to generate a sequence similarity matrix from RNA-Seq data, which can be used to visually identify sample mix-ups. This is particularly useful when a study contains multiple samples from the same individual, but can also detect contamination in studies with only one sample per individual.
Availability and implementation
RNASeq_similarity_matrix is available as a documented, GPL-licensed Docker image at www.github.com/nicokist/RNASeq_similarity_matrix. |
---|---|
ISSN: | 1367-4803 1460-2059 1367-4811 |
DOI: | 10.1093/bioinformatics/btz821 |
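
The summary above describes comparing samples pairwise by sequence similarity and looking for pairs whose similarity disagrees with their labels. The sketch below illustrates that general idea only; it is not the tool's implementation, and this record does not state how the tool computes similarity. It assumes genotype calls have already been made per sample at a shared set of sites, encoded as 0/1/2 with `None` for a missing call. The function names, the sample data, and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations


def similarity(a, b):
    """Fraction of sites with identical calls, ignoring sites missing in either sample."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")  # no overlapping sites, similarity undefined
    return sum(x == y for x, y in shared) / len(shared)


def similarity_matrix(genotypes):
    """Return sorted sample names and a symmetric nested-dict similarity matrix."""
    names = sorted(genotypes)
    matrix = {n: {n: 1.0} for n in names}
    for s, t in combinations(names, 2):
        matrix[s][t] = matrix[t][s] = similarity(genotypes[s], genotypes[t])
    return names, matrix


def suspicious_pairs(genotypes, individual_of, threshold=0.9):
    """Flag pairs whose similarity contradicts the recorded individual labels."""
    names, matrix = similarity_matrix(genotypes)
    flagged = []
    for s, t in combinations(names, 2):
        sim = matrix[s][t]
        same_label = individual_of[s] == individual_of[t]
        if sim >= threshold and not same_label:
            flagged.append((s, t, sim, "similar but labeled as different individuals"))
        elif sim < threshold and same_label:
            flagged.append((s, t, sim, "dissimilar but labeled as the same individual"))
    return flagged


# Hypothetical toy data: P3_blood carries the same genotypes as the P1 samples.
genotypes = {
    "P1_blood": [0, 1, 2, 1, 0, 2, 1, 0],
    "P1_skin": [0, 1, 2, 1, 0, 2, 1, 0],
    "P2_blood": [2, 0, 1, 0, 1, 0, 2, 2],
    "P3_blood": [0, 1, 2, 1, 0, 2, 1, None],
}
individual_of = {
    "P1_blood": "P1",
    "P1_skin": "P1",
    "P2_blood": "P2",
    "P3_blood": "P3",
}

for s, t, sim, reason in suspicious_pairs(genotypes, individual_of):
    print(f"{s} vs {t}: {sim:.2f} ({reason})")
```

In this toy data the two P1 samples match each other as expected, while P3_blood is flagged against both of them, which is the kind of pattern that would stand out visually in a plotted similarity matrix.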