Automated pseudogene detection reveals insights into historical gene sharing dynamics in prokaryotes
In recent years it has become apparent that prokaryotic genomes contain large numbers of pseudogenised genes which may provide valuable insights into the recent functional history of an organism. However, pseudogenes are difficult to detect ab initio and are not routinely reported by gene prediction t...
Published in: | Access microbiology 2022-05, Vol.4 (5) |
---|---|
Main Authors: | Dimonaco, Nicholas J; Aubrey, Wayne; Clare, Amanda; Kenobi, Kim; Creevey, Chris |
Format: | Article |
Language: | English |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | 5 |
container_start_page | |
container_title | Access microbiology |
container_volume | 4 |
creator | Dimonaco, Nicholas J; Aubrey, Wayne; Clare, Amanda; Kenobi, Kim; Creevey, Chris |
description | In recent years it has become apparent that prokaryotic genomes contain large numbers of pseudogenised genes which may provide valuable insights into the recent functional history of an organism. However, pseudogenes are difficult to detect ab initio and are not routinely reported by gene prediction tools.
We present StORF-R (Stop-ORF-Reporter), a tool that takes as input an annotated genome and returns putative missed genes (functional and/or pseudogenised) from the intergenic regions. We show that this methodology can recover gene-families that the state-of-the-art methods continue to misreport or completely omit.
We applied StORF-R to the intergenic regions of 2,665 E. coli genomes and found on average 244 previously missed pseudogenised genes (with in-frame stop codons) per genome, many of which had high scoring similarity to known Swiss-Prot proteins. Many of these pseudogenised genes form widespread gene families across E. coli strains.
To investigate if this phenomenon exists in other taxa we further applied the methodology to 44,048 bacterial genomes representing 8,244 species from Ensembl. This revealed many gene-families spanning multiple species with large (>10,000) numbers of copies of both intact and pseudogenised versions. Many of these families had only previously been reported in a single or few genomes, though we detected many hundred pseudogenised versions with StORF-R, changing our understanding of how widespread these genes truly are.
These pseudogenised genes represent a pangenomic ‘graveyard’ which may alter our understanding of the definition of core and accessory genes for many species. |
doi_str_mv | 10.1099/acmi.ac2021.po0147 |
format | article |
fulltext | fulltext |
identifier | ISSN: 2516-8290 |
ispartof | Access microbiology, 2022-05, Vol.4 (5) |
issn | 2516-8290 2516-8290 |
language | eng |
recordid | cdi_crossref_primary_10_1099_acmi_ac2021_po0147 |
source | PubMed Central Free |
title | Automated pseudogene detection reveals insights into historical gene sharing dynamics in prokaryotes |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T20%3A19%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automated%20pseudogene%20detection%20reveals%20insights%20into%20historical%20gene%20sharing%20dynamics%20in%20prokaryotes&rft.jtitle=Access%20microbiology&rft.au=Dimonaco,%20Nicholas%20J&rft.date=2022-05-18&rft.volume=4&rft.issue=5&rft.issn=2516-8290&rft.eissn=2516-8290&rft_id=info:doi/10.1099/acmi.ac2021.po0147&rft_dat=%3Ccrossref%3E10_1099_acmi_ac2021_po0147%3C/crossref%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c877-5433bf7480ddb63d60fd870a2dbfe62a2c46fed6db2933a0fbbeabb175e0bf423%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |