Discovery of a Novel Intron in US10/US11/US12 of HSV-1 Strain 17
Published in: | Viruses 2023-10, Vol.15 (11), p.2144 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Summary: | Herpes Simplex Virus type 1 (HSV-1) infects humans and causes a variety of clinical manifestations. Many HSV-1 genomes have been sequenced with high-throughput sequencing technologies, and the annotation of these genome sequences relies heavily on the known genes in reference strains. Consequently, the accuracy of reference strain annotation is critical for future research and treatment of HSV-1 infection. In this study, we analyzed RNA-Seq data of HSV-1 from NCBI databases and discovered a novel intron in the overlapping coding sequence (CDS) of US10 and US11, and the 3' UTR of US12, in strain 17, a commonly used HSV-1 reference strain. To comprehensively understand the shared US10/US11/US12 intron structure, we used US11 as a representative and surveyed all US11 gene sequences from the NCBI nt/nr database. A total of 193 high-quality US11 sequences were obtained, of which 186 sequences have a domain of uninterrupted tandemly repeated RXP (Arg-X-Pro) in the C-terminal half of the protein. In total, 97 of the 186 sequences encode US11 protein with the same length as the mature US11 in strain 17: 26 of them have the same structure as US11 and can be spliced as in strain 17; 71 of them have transcripts that are the same as mature US11 mRNA in strain 17. In total, 76 US11 gene sequences have either canonical or known noncanonical intron border sequences and may be spliced like strain 17 and obtain mature US11 CDS with the same length. If not spliced, they will have extra RXP repeats. A tandemly repeated RXP domain was proposed to be essential for US11 to bind with RNA and other host factors. US10 protein sequences from the same strains have also been studied. The results of this study show that even a frequently used reference organism may have errors in widely used databases. This study provides accurate annotation of the US10, US11, and US12 gene structure, which will build a more solid foundation to study expression regulation of the function of these genes. |
---|---|
ISSN: | 1999-4915 |
DOI: | 10.3390/v15112144 |
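
The summary refers to two sequence-level checks: finding an uninterrupted run of tandem RXP (Arg-X-Pro) repeats in a protein, and deciding whether a candidate intron has canonical or known noncanonical border sequences. The Python sketch below illustrates both ideas and is not the authors' pipeline. The function names are hypothetical, and the border sets are assumptions: GT...AG is the standard canonical pair, and GC...AG and AT...AC are used here as commonly cited noncanonical pairs, which may differ from the set used in the article.

```python
import re

# Overlapping search: at every position, capture the longest run of
# back-to-back Arg-X-Pro tripeptides that starts there (X = any residue).
_RXP_RUN_AT = re.compile(r"(?=((?:R.P)+))")

def longest_rxp_run(protein: str) -> int:
    """Return the number of tripeptides in the longest uninterrupted RXP run."""
    best = 0
    for match in _RXP_RUN_AT.finditer(protein.upper()):
        best = max(best, len(match.group(1)) // 3)
    return best

# Assumed border sets (5' dinucleotide, 3' dinucleotide); illustrative only.
CANONICAL = {("GT", "AG")}
NONCANONICAL = {("GC", "AG"), ("AT", "AC")}

def classify_intron_borders(intron: str) -> str:
    """Classify an intron (DNA or RNA letters) by its first and last two bases."""
    seq = intron.upper().replace("U", "T")
    if len(seq) < 4:
        return "too short"
    borders = (seq[:2], seq[-2:])
    if borders in CANONICAL:
        return "canonical"
    if borders in NONCANONICAL:
        return "noncanonical"
    return "unknown"
```

The lookahead makes the repeat search overlap-aware, so a run that begins one residue after an earlier, shorter match is still counted. A real survey would also need to confirm that the run lies in the C-terminal half of the protein and that splicing preserves the reading frame, neither of which this sketch attempts.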