MicNeSs: genotyping microsatellite loci from a collection of (NGS) reads

Microsatellites are widely used in population genetics to uncover recent evolutionary events. They are typically genotyped using a capillary sequencer, whose capacity is usually limited to 9, or at most 12, loci per run, and whose analysis is a tedious task performed by hand. With the rise of...

Bibliographic Details
Published in: Molecular ecology resources 2016-03, Vol.16 (2), p.524-533
Main Authors: Suez, Marie, Behdenna, Abdelkader, Brouillet, Sophie, Graça, Paula, Higuet, Dominique, Achaz, Guillaume
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c6117-728319d3d123a6215018d0d2b1e88f911ba05c87f50061b908f26c46c461e15c3
cites cdi_FETCH-LOGICAL-c6117-728319d3d123a6215018d0d2b1e88f911ba05c87f50061b908f26c46c461e15c3
container_end_page 533
container_issue 2
container_start_page 524
container_title Molecular ecology resources
container_volume 16
creator Suez, Marie
Behdenna, Abdelkader
Brouillet, Sophie
Graça, Paula
Higuet, Dominique
Achaz, Guillaume
description Microsatellites are widely used in population genetics to uncover recent evolutionary events. They are typically genotyped using a capillary sequencer, whose capacity is usually limited to 9, or at most 12, loci per run, and whose analysis is a tedious task performed by hand. With the rise of next‐generation sequencing (NGS), a much larger number of loci and individuals can be sequenced: for example, on a single run of a GS Junior, 28 loci from 96 individuals are sequenced at 30X coverage. We have developed an algorithm to automatically and efficiently genotype microsatellites from a collection of reads sorted by individual (e.g. specific PCR amplifications of a locus or a collection of reads that encompass a locus of interest). As sequencing and PCR amplification introduce artefactual insertions or deletions, the set of reads from a single microsatellite allele shows several length variants. The algorithm infers, without alignment, the true unknown allele(s) of each individual from the observed distributions of microsatellite lengths of all individuals. MicNeSs, a Python implementation of the algorithm, can be used to genotype any microsatellite locus from any organism and has been tested on 454 pyrosequencing data of several loci from fruit flies (a model species) and red deer (a nonmodel species). Without any parallelization, it automatically genotypes 22 loci from 441 individuals in 11 hours on a standard computer. The comparison of MicNeSs inferences to the standard method shows excellent agreement, with some differences illustrating the pros and cons of both methods.
doi_str_mv 10.1111/1755-0998.12467
format article
fullrecord <record><control><sourceid>proquest_hal_p</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_01544754v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1761078183</sourcerecordid><originalsourceid>FETCH-LOGICAL-c6117-728319d3d123a6215018d0d2b1e88f911ba05c87f50061b908f26c46c461e15c3</originalsourceid><addsrcrecordid>eNqNkU1v1DAQhi0EoqVw5gaWuLSHtJ7EX-mtqsouYrsgbSt6s7yOs7gk8WJngf33OKTNgQtYtsYaP_Nqxi9Cr4GcQlpnIBjLSFnKU8gpF0_Q4ZR5Ot3l3QF6EeM9IZyUgj5HBzmnhAjgh2h-7czSruI53tjO9_ut6za4dSb4qHvbNK63uPHG4Tr4FmtsfNNY0zvfYV_j4-VsdYKD1VV8iZ7Vuon21UM8Qrfvr24u59ni0-zD5cUiMxxAZCKXBZRVUUFeaJ4DIyArUuVrsFLWJcBaE2akqFnqFtYlkXXODR02WGCmOEIno-5X3ahtcK0Oe-W1U_OLhRpyBBilgtEfkNjjkd0G_31nY69aF02aSnfW76ICIThnDKT8D5QDERJkkdB3f6H3fhe6NPRAkXQk44k6G6nhK2Ow9dQsEDV4pwZ31OCU-uNdqnjzoLtbt7aa-EezEsBG4Kdr7P5feur6avkonI11Lvb211SnwzeVXgVTX5YzVXye360kfFQ3iX878rX2Sm-Ci-p2lRPghABllIniN8PQtnY</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1760176856</pqid></control><display><type>article</type><title>MicNeSs: genotyping microsatellite loci from a collection of (NGS) reads</title><source>Wiley-Blackwell Read &amp; Publish Collection</source><creator>Suez, Marie ; Behdenna, Abdelkader ; Brouillet, Sophie ; Graça, Paula ; Higuet, Dominique ; Achaz, Guillaume</creator><creatorcontrib>Suez, Marie ; Behdenna, Abdelkader ; Brouillet, Sophie ; Graça, Paula ; Higuet, Dominique ; Achaz, Guillaume</creatorcontrib><description>Microsatellites are widely used in population genetics to uncover recent evolutionary events. They are typically genotyped using capillary sequencer, which capacity is usually limited to 9, at most 12 loci for each run, and which analysis is a tedious task that is performed by hand. 
With the rise of next‐generation sequencing (NGS), a much larger number of loci and individuals are available from sequencing: for example, on a single run of a GS Junior, 28 loci from 96 individuals are sequenced with a 30X cover. We have developed an algorithm to automatically and efficiently genotype microsatellites from a collection of reads sorted by individual (e.g. specific PCR amplifications of a locus or a collection of reads that encompass a locus of interest). As the sequencing and the PCR amplification introduce artefactual insertions or deletions, the set of reads from a single microsatellite allele shows several length variants. The algorithm infers, without alignment, the true unknown allele(s) of each individual from the observed distributions of microsatellites length of all individuals. MicNeSs, a python implementation of the algorithm, can be used to genotype any microsatellite locus from any organism and has been tested on 454 pyrosequencing data of several loci from fruit flies (a model species) and red deers (a nonmodel species). Without any parallelization, it automatically genotypes 22 loci from 441 individuals in 11 hours on a standard computer. 
The comparison of MicNeSs inferences to the standard method shows an excellent agreement, with some differences illustrating the pros and cons of both methods.</description><identifier>ISSN: 1755-098X</identifier><identifier>EISSN: 1755-0998</identifier><identifier>DOI: 10.1111/1755-0998.12467</identifier><identifier>PMID: 26400716</identifier><language>eng</language><publisher>England: Blackwell Pub</publisher><subject>Algorithms ; Biodiversity ; Computational Biology - methods ; Drosophila ; genotyping ; Genotyping Techniques - methods ; High-Throughput Nucleotide Sequencing ; Life Sciences ; microsatellite loci ; Microsatellite Repeats ; next-generation sequencing (NGS) ; Python</subject><ispartof>Molecular ecology resources, 2016-03, Vol.16 (2), p.524-533</ispartof><rights>2015 John Wiley &amp; Sons Ltd</rights><rights>2015 John Wiley &amp; Sons Ltd.</rights><rights>Copyright © 2016 John Wiley &amp; Sons Ltd</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c6117-728319d3d123a6215018d0d2b1e88f911ba05c87f50061b908f26c46c461e15c3</citedby><cites>FETCH-LOGICAL-c6117-728319d3d123a6215018d0d2b1e88f911ba05c87f50061b908f26c46c461e15c3</cites><orcidid>0000-0002-0845-4272 ; 0000-0003-4514-5935</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/26400716$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://hal.science/hal-01544754$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Suez, Marie</creatorcontrib><creatorcontrib>Behdenna, Abdelkader</creatorcontrib><creatorcontrib>Brouillet, 
Sophie</creatorcontrib><creatorcontrib>Graça, Paula</creatorcontrib><creatorcontrib>Higuet, Dominique</creatorcontrib><creatorcontrib>Achaz, Guillaume</creatorcontrib><title>MicNeSs: genotyping microsatellite loci from a collection of (NGS) reads</title><title>Molecular ecology resources</title><addtitle>Mol Ecol Resour</addtitle><description>Microsatellites are widely used in population genetics to uncover recent evolutionary events. They are typically genotyped using capillary sequencer, which capacity is usually limited to 9, at most 12 loci for each run, and which analysis is a tedious task that is performed by hand. With the rise of next‐generation sequencing (NGS), a much larger number of loci and individuals are available from sequencing: for example, on a single run of a GS Junior, 28 loci from 96 individuals are sequenced with a 30X cover. We have developed an algorithm to automatically and efficiently genotype microsatellites from a collection of reads sorted by individual (e.g. specific PCR amplifications of a locus or a collection of reads that encompass a locus of interest). As the sequencing and the PCR amplification introduce artefactual insertions or deletions, the set of reads from a single microsatellite allele shows several length variants. The algorithm infers, without alignment, the true unknown allele(s) of each individual from the observed distributions of microsatellites length of all individuals. MicNeSs, a python implementation of the algorithm, can be used to genotype any microsatellite locus from any organism and has been tested on 454 pyrosequencing data of several loci from fruit flies (a model species) and red deers (a nonmodel species). Without any parallelization, it automatically genotypes 22 loci from 441 individuals in 11 hours on a standard computer. 
The comparison of MicNeSs inferences to the standard method shows an excellent agreement, with some differences illustrating the pros and cons of both methods.</description><subject>Algorithms</subject><subject>Biodiversity</subject><subject>Computational Biology - methods</subject><subject>Drosophila</subject><subject>genotyping</subject><subject>Genotyping Techniques - methods</subject><subject>High-Throughput Nucleotide Sequencing</subject><subject>Life Sciences</subject><subject>microsatellite loci</subject><subject>Microsatellite Repeats</subject><subject>next-generation sequencing (NGS)</subject><subject>Python</subject><issn>1755-098X</issn><issn>1755-0998</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><recordid>eNqNkU1v1DAQhi0EoqVw5gaWuLSHtJ7EX-mtqsouYrsgbSt6s7yOs7gk8WJngf33OKTNgQtYtsYaP_Nqxi9Cr4GcQlpnIBjLSFnKU8gpF0_Q4ZR5Ot3l3QF6EeM9IZyUgj5HBzmnhAjgh2h-7czSruI53tjO9_ut6za4dSb4qHvbNK63uPHG4Tr4FmtsfNNY0zvfYV_j4-VsdYKD1VV8iZ7Vuon21UM8Qrfvr24u59ni0-zD5cUiMxxAZCKXBZRVUUFeaJ4DIyArUuVrsFLWJcBaE2akqFnqFtYlkXXODR02WGCmOEIno-5X3ahtcK0Oe-W1U_OLhRpyBBilgtEfkNjjkd0G_31nY69aF02aSnfW76ICIThnDKT8D5QDERJkkdB3f6H3fhe6NPRAkXQk44k6G6nhK2Ow9dQsEDV4pwZ31OCU-uNdqnjzoLtbt7aa-EezEsBG4Kdr7P5feur6avkonI11Lvb211SnwzeVXgVTX5YzVXye360kfFQ3iX878rX2Sm-Ci-p2lRPghABllIniN8PQtnY</recordid><startdate>201603</startdate><enddate>201603</enddate><creator>Suez, Marie</creator><creator>Behdenna, Abdelkader</creator><creator>Brouillet, Sophie</creator><creator>Graça, Paula</creator><creator>Higuet, Dominique</creator><creator>Achaz, Guillaume</creator><general>Blackwell Pub</general><general>Blackwell Publishing Ltd</general><general>Wiley Subscription Services, 
Inc</general><general>Wiley/Blackwell</general><scope>FBQ</scope><scope>BSCLL</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SN</scope><scope>7SS</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>1XC</scope><orcidid>https://orcid.org/0000-0002-0845-4272</orcidid><orcidid>https://orcid.org/0000-0003-4514-5935</orcidid></search><sort><creationdate>201603</creationdate><title>MicNeSs: genotyping microsatellite loci from a collection of (NGS) reads</title><author>Suez, Marie ; Behdenna, Abdelkader ; Brouillet, Sophie ; Graça, Paula ; Higuet, Dominique ; Achaz, Guillaume</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c6117-728319d3d123a6215018d0d2b1e88f911ba05c87f50061b908f26c46c461e15c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Algorithms</topic><topic>Biodiversity</topic><topic>Computational Biology - methods</topic><topic>Drosophila</topic><topic>genotyping</topic><topic>Genotyping Techniques - methods</topic><topic>High-Throughput Nucleotide Sequencing</topic><topic>Life Sciences</topic><topic>microsatellite loci</topic><topic>Microsatellite Repeats</topic><topic>next-generation sequencing (NGS)</topic><topic>Python</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Suez, Marie</creatorcontrib><creatorcontrib>Behdenna, Abdelkader</creatorcontrib><creatorcontrib>Brouillet, Sophie</creatorcontrib><creatorcontrib>Graça, Paula</creatorcontrib><creatorcontrib>Higuet, Dominique</creatorcontrib><creatorcontrib>Achaz, Guillaume</creatorcontrib><collection>AGRIS</collection><collection>Istex</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE 
(Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Ecology Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>Hyper Article en Ligne (HAL)</collection><jtitle>Molecular ecology resources</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Suez, Marie</au><au>Behdenna, Abdelkader</au><au>Brouillet, Sophie</au><au>Graça, Paula</au><au>Higuet, Dominique</au><au>Achaz, Guillaume</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>MicNeSs: genotyping microsatellite loci from a collection of (NGS) reads</atitle><jtitle>Molecular ecology resources</jtitle><addtitle>Mol Ecol Resour</addtitle><date>2016-03</date><risdate>2016</risdate><volume>16</volume><issue>2</issue><spage>524</spage><epage>533</epage><pages>524-533</pages><issn>1755-098X</issn><eissn>1755-0998</eissn><abstract>Microsatellites are widely used in population genetics to uncover recent evolutionary events. They are typically genotyped using capillary sequencer, which capacity is usually limited to 9, at most 12 loci for each run, and which analysis is a tedious task that is performed by hand. With the rise of next‐generation sequencing (NGS), a much larger number of loci and individuals are available from sequencing: for example, on a single run of a GS Junior, 28 loci from 96 individuals are sequenced with a 30X cover. 
We have developed an algorithm to automatically and efficiently genotype microsatellites from a collection of reads sorted by individual (e.g. specific PCR amplifications of a locus or a collection of reads that encompass a locus of interest). As the sequencing and the PCR amplification introduce artefactual insertions or deletions, the set of reads from a single microsatellite allele shows several length variants. The algorithm infers, without alignment, the true unknown allele(s) of each individual from the observed distributions of microsatellites length of all individuals. MicNeSs, a python implementation of the algorithm, can be used to genotype any microsatellite locus from any organism and has been tested on 454 pyrosequencing data of several loci from fruit flies (a model species) and red deers (a nonmodel species). Without any parallelization, it automatically genotypes 22 loci from 441 individuals in 11 hours on a standard computer. The comparison of MicNeSs inferences to the standard method shows an excellent agreement, with some differences illustrating the pros and cons of both methods.</abstract><cop>England</cop><pub>Blackwell Pub</pub><pmid>26400716</pmid><doi>10.1111/1755-0998.12467</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-0845-4272</orcidid><orcidid>https://orcid.org/0000-0003-4514-5935</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1755-098X
ispartof Molecular ecology resources, 2016-03, Vol.16 (2), p.524-533
issn 1755-098X
1755-0998
language eng
recordid cdi_hal_primary_oai_HAL_hal_01544754v1
source Wiley-Blackwell Read & Publish Collection
subjects Algorithms
Biodiversity
Computational Biology - methods
Drosophila
genotyping
Genotyping Techniques - methods
High-Throughput Nucleotide Sequencing
Life Sciences
microsatellite loci
Microsatellite Repeats
next-generation sequencing (NGS)
Python
title MicNeSs: genotyping microsatellite loci from a collection of (NGS) reads
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T21%3A20%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MicNeSs:%20genotyping%20microsatellite%20loci%20from%20a%20collection%20of%20(NGS)%20reads&rft.jtitle=Molecular%20ecology%20resources&rft.au=Suez,%20Marie&rft.date=2016-03&rft.volume=16&rft.issue=2&rft.spage=524&rft.epage=533&rft.pages=524-533&rft.issn=1755-098X&rft.eissn=1755-0998&rft_id=info:doi/10.1111/1755-0998.12467&rft_dat=%3Cproquest_hal_p%3E1761078183%3C/proquest_hal_p%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c6117-728319d3d123a6215018d0d2b1e88f911ba05c87f50061b908f26c46c461e15c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1760176856&rft_id=info:pmid/26400716&rfr_iscdi=true