Establishment of reference standards for multifaceted mosaic variant analysis


Bibliographic Details
Published in: Scientific Data 2022-02, Vol. 9 (1), p. 35, Article 35
Main Authors: Ha, Yoo-Jin, Oh, Myung Joon, Kim, Junhan, Kim, Jisoo, Kang, Seungseok, Minna, John D., Kim, Hyun Seok, Kim, Sangwoo
Format: Article
Language: English
Description
Summary: Detection of somatic mosaicism in non-proliferative cells is a new challenge in genome research; however, the accuracy of current detection strategies remains uncertain due to the lack of a ground truth. Herein, we present a set of ultra-deep whole-exome sequencing (WES) data based on reference standards generated by cell line mixtures, providing a total of 386,613 mosaic single-nucleotide variants (SNVs) and insertion-deletion mutations (INDELs) with variant allele frequencies (VAFs) ranging from 0.5% to 56%, as well as 35,113,417 non-variant and 19,936 germline variant sites as negative controls. Because cell lines with established genotypes are mixed progressively, the full reference standard set mimics the cumulative acquisition of mosaic variants, as occurs in the early developmental stage, and ultimately reveals 741 possible inter-sample relationships with respect to variant sharing and asymmetry in VAFs. We expect that our reference data will be essential for optimizing the current use of mosaic variant detection strategies and for developing algorithms to enable future improvements.
Measurement(s): genotype
Technology Type(s): DNA sequencing
Factor Type(s): genotyping
Sample Characteristic - Organism: Homo sapiens
Sample Characteristic - Environment: cell line
Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.16970041
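As a rough illustration of how mixing cell lines with known genotypes produces variants at controlled allele frequencies, the sketch below computes the expected VAF at one site from the mixing proportions. It is not the authors' pipeline: the function name `expected_vaf`, the cell-line labels, and the mixing fractions are hypothetical, and it assumes every cell line carries the same number of copies (diploid by default) at the site.

```python
from typing import Dict


def expected_vaf(
    mix_fractions: Dict[str, float],
    alt_copies: Dict[str, int],
    ploidy: int = 2,
) -> float:
    """Expected variant allele frequency at one site in a cell-line mixture.

    mix_fractions: hypothetical cell-line label -> fraction of cells (must sum to 1).
    alt_copies:    cell-line label -> number of alternate-allele copies (0..ploidy);
                   lines missing from this mapping are treated as reference-only.
    ploidy:        copies per cell at the site (ASSUMPTION: identical for all lines).
    """
    total = sum(mix_fractions.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("mixing fractions must sum to 1")
    for line, copies in alt_copies.items():
        if not 0 <= copies <= ploidy:
            raise ValueError(f"alt copies for {line} must be between 0 and {ploidy}")

    # Average number of alternate copies per cell, weighted by mixing fraction,
    # divided by the total number of copies per cell.
    alt_per_cell = sum(
        fraction * alt_copies.get(line, 0)
        for line, fraction in mix_fractions.items()
    )
    return alt_per_cell / ploidy


if __name__ == "__main__":
    # Hypothetical mixture: 1% of cells come from a line that is heterozygous
    # for the variant; the remaining 99% carry only the reference allele.
    vaf = expected_vaf({"line_A": 0.99, "line_B": 0.01}, {"line_B": 1})
    print(f"expected VAF: {vaf:.2%}")
```

Under these assumptions, a heterozygous variant private to a cell line that makes up 1% of the mixture lands at the 0.5% lower bound quoted in the summary. Real cell lines often deviate from diploidy, so observed VAFs can differ from this idealized value.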
ISSN: 2052-4463
DOI: 10.1038/s41597-022-01133-8