Discovery of a Novel Intron in US10/US11/US12 of HSV-1 Strain 17

Bibliographic Details
Published in: Viruses 2023-10, Vol.15 (11), p.2144
Main Authors: Chang, Weizhong; Hao, Ming; Qiu, Ju; Sherman, Brad T; Imamichi, Tomozumi
Format: Article
Language: English
Description
Summary: Herpes Simplex Virus type 1 (HSV-1) infects humans and causes a variety of clinical manifestations. Many HSV-1 genomes have been sequenced with high-throughput sequencing technologies, and the annotation of these genome sequences relies heavily on the known genes in reference strains. Consequently, the accuracy of reference strain annotation is critical for future research and treatment of HSV-1 infection. In this study, we analyzed RNA-Seq data of HSV-1 from NCBI databases and discovered a novel intron in the overlapping coding sequence (CDS) of US10 and US11, and the 3' UTR of US12, in strain 17, a commonly used HSV-1 reference strain. To comprehensively understand the shared US10/US11/US12 intron structure, we used US11 as a representative and surveyed all US11 gene sequences from the NCBI nt/nr database. A total of 193 high-quality sequences were obtained, of which 186 have a domain of uninterrupted tandemly repeated RXP (Arg-X-Pro) in the C-terminal half of the protein. In total, 97 of the 186 sequences encode a US11 protein of the same length as the mature US11 in strain 17: 26 of them have the same US11 structure and can be spliced as in strain 17; 71 of them have transcripts that are the same as the mature US11 mRNA in strain 17. In total, 76 US11 gene sequences have either canonical or known noncanonical intron border sequences and may be spliced as in strain 17 to yield a mature CDS of the same length. If not spliced, they will have extra RXP repeats. A tandemly repeated RXP domain has been proposed to be essential for US11 to bind RNA and other host factors. US10 protein sequences from the same strains were also studied. The results of this study show that even a frequently used reference organism may have errors in widely used databases. This study provides accurate annotation of the US10, US11, and US12 gene structure, which will build a more solid foundation for studying the expression regulation and function of these genes.
ISSN: 1999-4915
DOI: 10.3390/v15112144
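
Illustration: the summary describes a domain of uninterrupted tandemly repeated RXP (Arg-X-Pro) and notes that unspliced sequences carry extra RXP repeats. A minimal Python sketch of how such a run could be measured in a protein sequence follows. The function name `longest_rxp_run` and the toy sequences are hypothetical and are not taken from the article; the authors' actual pipeline is not described in this record.

```python
import re

# Assumption: a "tandem RXP run" is a stretch of back-to-back triplets
# R-any-P with no intervening residues. The lookahead lets us examine a
# run starting at every position, so runs in any frame are considered.
_RXP_RUN_AT_EACH_START = re.compile(r"(?=((?:R.P)+))")


def longest_rxp_run(protein: str) -> int:
    """Return the number of triplets in the longest uninterrupted RXP run."""
    sequence = protein.upper()
    longest = 0
    for match in _RXP_RUN_AT_EACH_START.finditer(sequence):
        triplets = len(match.group(1)) // 3
        longest = max(longest, triplets)
    return longest


if __name__ == "__main__":
    # Toy sequences (hypothetical, not real US11 protein sequences).
    print(longest_rxp_run("MSRAPRTPRVPAA"))
    print(longest_rxp_run("RAPGGRSPRTP"))
```

Comparing this count between a spliced and an unspliced translation of the same gene would, under these assumptions, show the extra repeats the summary mentions.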