De novo genome assembly of Candida glabrata reveals cell wall protein complement and structure of dispersed tandem repeat arrays

Bibliographic Details
Published in: Molecular Microbiology, 2020-06, Vol. 113 (6), p. 1209-1224
Main Authors: Xu, Zhuwei; Green, Brian; Benoit, Nicole; Schatz, Michael; Wheelan, Sarah; Cormack, Brendan
Format: Article
Language: English
Description
Summary: Candida glabrata is an opportunistic pathogen in humans, responsible for approximately 20% of disseminated candidiasis. Candida glabrata's ability to adhere to host tissue is mediated by GPI‐anchored cell wall proteins (GPI‐CWPs); the corresponding genes contain long tandem repeat regions. These repeat regions resulted in assembly errors in the reference genome. Here, we performed a de novo assembly of the C. glabrata type strain CBS138 using long single‐molecule real‐time reads, with short read sequences (Illumina) for refinement, and constructed telomere‐to‐telomere assemblies of all 13 chromosomes. Our assembly has excellent agreement overall with the current reference genome, but we made substantial corrections within tandem repeat regions. Specifically, we removed 62 genes of which 45 were scrambled due to misassembly in the reference. We annotated 31 novel ORFs of which 24 ORFs are GPI‐CWPs. In addition, we corrected the tandem repeat structure of an additional 21 genes. Our corrections to the genome were substantial, with the length of new genes and tandem repeat corrections amounting to approximately 3.8% of the ORFeome length. As most corrections were within the coding regions of GPI‐CWP genes, our genome assembly establishes a high‐quality reference set of genes and repeat structures for the functional analysis of these cell surface proteins. We present a de novo assembly of the Candida glabrata genome. This assembly corrects the large‐scale systematic misassemblies of tandem repeat sequences found in the coding regions of cell wall protein genes. We document the correct number and structure of cell wall protein genes in this opportunistic pathogen.
ISSN: 0950-382X, 1365-2958
DOI: 10.1111/mmi.14488