Improving Silkworm Genome Annotation Using a Proteogenomics Approach

Bibliographic Details
Published in: Journal of Proteome Research, 2019-08, Vol. 18 (8), p. 3009-3019
Main Authors: Ye, Xiaogang, Tang, Xiaoli, Wang, Xiaoxiao, Che, Jiaqian, Wu, Meiyu, Liang, Jianshe, Ye, Lupeng, Qian, Qiujie, Li, Jianying, You, Zhengying, Zhang, Yuyu, Wang, Shaohua, Zhong, Boxiong
Format: Article
Language: English
Description
Summary: The silkworm genome has been deeply sequenced and assembled, but accurate genome annotation, which is important for modern biological research, remains far from complete. To improve silkworm genome annotation, we carried out a proteogenomics analysis using 9.8 million mass spectra collected from different tissues and developmental stages of the silkworm. The results confirmed the translational products of 4307 existing gene models and identified 1701 novel genome search-specific peptides (GSSPs). Using these GSSPs, 74 novel gene-coding sequences were identified, and 121 existing gene models were corrected. We also identified 1182 novel junction peptides based on an exon-skipping database that resulted in the identification of 973 alternative splicing sites. Furthermore, we performed RNA-seq analysis to improve silkworm genome annotation at the transcriptional level. A total of 1704 new transcripts and 1136 new exons were identified, 2581 untranslated regions (UTRs) were revised, and 1301 alternative splicing (AS) genes were identified. The transcriptomics results were integrated with the proteomics data to further complement and verify the new annotations. In addition, 14 incorrect genes and 10 skipped exons were verified using the two analysis methods. Altogether, we identified 1838 new transcripts and 1593 AS genes and revised 5074 existing genes using proteogenomics and transcriptome analyses. Data are available via ProteomeXchange with identifier PXD009672. The large-scale proteogenomics and transcriptome analyses in this study will greatly improve silkworm genome annotation and contribute to future studies.
ISSN: 1535-3893, 1535-3907
DOI: 10.1021/acs.jproteome.8b00965