Genomic breeding values, SNP effects and gene identification for disease traits in cow training sets

Summary Holstein Friesian cow training sets were created according to disease incidences. The different datasets were used to investigate the impact of random forest (RF) and genomic BLUP (GBLUP) methodology on genomic prediction accuracies. In addition, for further verifications of some specific sc...

Bibliographic Details
Published in: Animal genetics 2018-06, Vol.49 (3), p.178-192
Main Authors: Naderi, S., Bohlouli, M., Yin, T., König, S.
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c3531-8aba6ce51ad6c081b5a439b8d8de69a21b39b6dc7a7b8dafecd67944c5406a0f3
cites cdi_FETCH-LOGICAL-c3531-8aba6ce51ad6c081b5a439b8d8de69a21b39b6dc7a7b8dafecd67944c5406a0f3
container_end_page 192
container_issue 3
container_start_page 178
container_title Animal genetics
container_volume 49
creator Naderi, S.
Bohlouli, M.
Yin, T.
König, S.
description Summary Holstein Friesian cow training sets were created according to disease incidences. The different datasets were used to investigate the impact of random forest (RF) and genomic BLUP (GBLUP) methodology on genomic prediction accuracies. In addition, for further verifications of some specific scenarios, single‐step genomic BLUP was applied. Disease traits included the overall trait categories of (i) claw disorders, (ii) clinical mastitis and (iii) infertility from 80 741 first lactation Holstein cows kept in 58 large‐scale herds. A subset of 6744 cows was genotyped (50K SNP panel). Response variables for all scenarios were de‐regressed proofs (DRPs) and pre‐corrected phenotypes (PCPs). Initially, all sick cows were allocated to the testing set, and healthy cows represented the training set. For the ongoing cow allocation schemes, the number of sick cows in the training set increased stepwise by moving 10% of the sick cows from the testing to the training set in each step. The size of training and testing sets was kept constant by replacing the same number of cows in the testing set with (randomly selected) healthy cows from the training set. For both the RF and GBLUP methods, prediction accuracies were larger for DRPs compared to PCPs. For PCPs as a response variable, the largest prediction accuracies were observed when the disease incidences in training sets reflected the disease incidence in the whole population. A further increase in prediction accuracies for some selected cow allocation schemes (i.e. larger prediction accuracies compared to corresponding scenarios with RF or GBLUP) was achieved via single‐step GBLUP applications. Correlations between genome‐wide association study SNP effects and RF importance criteria for single SNPs were in a moderate range, from 0.42 to 0.57, when considering SNPs from all chromosomes or from specific chromosome segments. RF identified significant SNPs close to potential positional candidate genes: GAS1, GPAT3 and CYP2R1 for clinical mastitis; SPINK5 and SLC26A2 for laminitis; and FGF12 for endometritis.
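
The stepwise cow allocation described in the summary can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `allocation_schemes`, the ID lists, the fixed seed and the rule for picking which sick cows move are all assumptions made for illustration, and the sketch assumes there are at least as many healthy cows as sick cows.

```python
import random


def allocation_schemes(sick_ids, healthy_ids, step_fraction=0.10, seed=1):
    """Return a list of (training_set, testing_set) pairs, one per scheme.

    Scheme 0: all healthy cows in training, all sick cows in testing.
    Each later scheme moves `step_fraction` of the sick cows into training
    and swaps the same number of randomly chosen healthy cows into testing,
    so both set sizes stay constant. (Hypothetical helper, illustration only.)
    """
    rng = random.Random(seed)
    sick_test = list(sick_ids)
    healthy_train = list(healthy_ids)
    rng.shuffle(sick_test)       # random order = random choice of cows to move
    rng.shuffle(healthy_train)
    sick_train, healthy_test = [], []

    n_move = round(step_fraction * len(sick_test))
    if n_move < 1:
        raise ValueError("step_fraction is too small for this number of sick cows")

    schemes = [(set(healthy_train), set(sick_test))]
    while sick_test and healthy_train:
        k = min(n_move, len(sick_test), len(healthy_train))
        for _ in range(k):
            sick_train.append(sick_test.pop())        # sick cow: testing -> training
            healthy_test.append(healthy_train.pop())  # healthy cow: training -> testing
        schemes.append((set(healthy_train) | set(sick_train),
                        set(sick_test) | set(healthy_test)))
    return schemes


if __name__ == "__main__":
    sick = [f"s{i}" for i in range(20)]
    healthy = [f"h{i}" for i in range(80)]
    for step, (train, test) in enumerate(allocation_schemes(sick, healthy)):
        n_sick_train = len(train & set(sick))
        print(step, len(train), len(test), n_sick_train)
```

With 10% steps the sketch produces eleven schemes, from no sick cows in the training set to all of them. The disease incidence in the training set therefore rises step by step while the training and testing set sizes stay fixed, which is the property the summary emphasises.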
doi_str_mv 10.1111/age.12661
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2022982353</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2047364419</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3531-8aba6ce51ad6c081b5a439b8d8de69a21b39b6dc7a7b8dafecd67944c5406a0f3</originalsourceid><addsrcrecordid>eNp1kNFKwzAUhoMobk4vfAEJeKNgZ5KmaXspMqcwVFCvy2lyOiJdqkmr-PbGTb0QzE044ePLf35CDjmb8njOYYlTLpTiW2TMU5UlgmVim4yZUEVScqlGZC-EZ8ZYwXO-S0aiVELmLBsTM0fXraymtUc01i3pG7QDhjP6cHtPsWlQ94GCM3SJDqk16HrbWA297RxtOk-NDQgBae_BRtQ6qrv39eS-dAH7sE92GmgDHnzfE_J0NXu8vE4Wd_Oby4tFotMs5UkBNSiNGQejdIxaZyDTsi5MYVCVIHgdJ2V0Dnl8hBjNqLyUUmeSKWBNOiEnG--L717jEn21skFj24LDbgiVYEKUhYifRfT4D_rcDd7FdJGSeaqk5GWkTjeU9l0IHpvqxdsV-I-Ks-qr-ipWX62rj-zRt3GoV2h-yZ-uI3C-Ad5tix__m6qL-Wyj_ARM941I</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2047364419</pqid></control><display><type>article</type><title>Genomic breeding values, SNP effects and gene identification for disease traits in cow training sets</title><source>Wiley</source><creator>Naderi, S. ; Bohlouli, M. ; Yin, T. ; König, S.</creator><creatorcontrib>Naderi, S. ; Bohlouli, M. ; Yin, T. ; König, S.</creatorcontrib><description>Summary Holstein Friesian cow training sets were created according to disease incidences. The different datasets were used to investigate the impact of random forest (RF) and genomic BLUP (GBLUP) methodology on genomic prediction accuracies. In addition, for further verifications of some specific scenarios, single‐step genomic BLUP was applied. Disease traits included the overall trait categories of (i) claw disorders, (ii) clinical mastitis and (iii) infertility from 80 741 first lactation Holstein cows kept in 58 large‐scale herds. A subset of 6744 cows was genotyped (50K SNP panel). Response variables for all scenarios were de‐regressed proofs (DRPs) and pre‐corrected phenotypes (PCPs). 
Initially, all sick cows were allocated to the testing set, and healthy cows represented the training set. For the ongoing cow allocation schemes, the number of sick cows in the training set increased stepwise by moving 10% of the sick cows from the testing to the training set in each step. The size of training and testing sets was kept constant by replacing the same number of cows in the testing set with (randomly selected) healthy cows from the training set. For both the RF and GBLUP methods, prediction accuracies were larger for DRPs compared to PCPs. For PCPs as a response variable, the largest prediction accuracies were observed when the disease incidences in training sets reflected the disease incidence in the whole population. A further increase in prediction accuracies for some selected cow allocation schemes (i.e. larger prediction accuracies compared to corresponding scenarios with RF or GBLUP) was achieved via single‐step GBLUP applications. Correlations between genome‐wide association study SNP effects and RF importance criteria for single SNPs were in a moderate range, from 0.42 to 0.57, when considering SNPs from all chromosomes or from specific chromosome segments.
RF identified significant SNPs close to potential positional candidate genes: GAS1, GPAT3 and CYP2R1 for clinical mastitis; SPINK5 and SLC26A2 for laminitis; and FGF12 for endometritis.</description><identifier>ISSN: 0268-9146</identifier><identifier>EISSN: 1365-2052</identifier><identifier>DOI: 10.1111/age.12661</identifier><identifier>PMID: 29624705</identifier><language>eng</language><publisher>England: Wiley Subscription Services, Inc</publisher><subject>Breeding ; Cattle ; Chromosomes ; Correlation analysis ; Endometritis ; Gas1 protein ; Genome-wide association studies ; Genomes ; genome‐wide associations ; genomic BLUP ; genomic predictions ; Infertility ; Lactation ; Mastitis ; Phenotypes ; random forest ; Single-nucleotide polymorphism ; Training</subject><ispartof>Animal genetics, 2018-06, Vol.49 (3), p.178-192</ispartof><rights>2018 Stichting International Foundation for Animal Genetics</rights><rights>2018 Stichting International Foundation for Animal Genetics.</rights><rights>Copyright © 2018 Stichting International Foundation for Animal Genetics</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c3531-8aba6ce51ad6c081b5a439b8d8de69a21b39b6dc7a7b8dafecd67944c5406a0f3</citedby><cites>FETCH-LOGICAL-c3531-8aba6ce51ad6c081b5a439b8d8de69a21b39b6dc7a7b8dafecd67944c5406a0f3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27923,27924</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29624705$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Naderi, S.</creatorcontrib><creatorcontrib>Bohlouli, M.</creatorcontrib><creatorcontrib>Yin, T.</creatorcontrib><creatorcontrib>König, S.</creatorcontrib><title>Genomic breeding values, SNP effects and gene identification for disease traits in cow 
training sets</title><title>Animal genetics</title><addtitle>Anim Genet</addtitle><description>Summary Holstein Friesian cow training sets were created according to disease incidences. The different datasets were used to investigate the impact of random forest (RF) and genomic BLUP (GBLUP) methodology on genomic prediction accuracies. In addition, for further verifications of some specific scenarios, single‐step genomic BLUP was applied. Disease traits included the overall trait categories of (i) claw disorders, (ii) clinical mastitis and (iii) infertility from 80 741 first lactation Holstein cows kept in 58 large‐scale herds. A subset of 6744 cows was genotyped (50K SNP panel). Response variables for all scenarios were de‐regressed proofs (DRPs) and pre‐corrected phenotypes (PCPs). Initially, all sick cows were allocated to the testing set, and healthy cows represented the training set. For the ongoing cow allocation schemes, the number of sick cows in the training set increased stepwise by moving 10% of the sick cows from the testing to the training set in each step. The size of training and testing sets was kept constant by replacing the same number of cows in the testing set with (randomly selected) healthy cows from the training set. For both the RF and GBLUP methods, prediction accuracies were larger for DRPs compared to PCPs. For PCPs as a response variable, the largest prediction accuracies were observed when the disease incidences in training sets reflected the disease incidence in the whole population. A further increase in prediction accuracies for some selected cow allocation schemes (i.e. larger prediction accuracies compared to corresponding scenarios with RF or GBLUB) was achieved via single‐step GBLUP applications. Correlations between genome‐wide association study SNP effects and RF importance criteria for single SNPs were in a moderate range, from 0.42 to 0.57, when considering SNPs from all chromosomes or from specific chromosome segments. 
RF identified significant SNPs close to potential positional candidate genes: GAS1, GPAT3 and CYP2R1 for clinical mastitis; SPINK5 and SLC26A2 for laminitis; and FGF12 for endometritis.</description><subject>Breeding</subject><subject>Cattle</subject><subject>Chromosomes</subject><subject>Correlation analysis</subject><subject>Endometritis</subject><subject>Gas1 protein</subject><subject>Genome-wide association studies</subject><subject>Genomes</subject><subject>genome‐wide associations</subject><subject>genomic BLUP</subject><subject>genomic predictions</subject><subject>Infertility</subject><subject>Lactation</subject><subject>Mastitis</subject><subject>Phenotypes</subject><subject>random forest</subject><subject>Single-nucleotide polymorphism</subject><subject>Training</subject><issn>0268-9146</issn><issn>1365-2052</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNp1kNFKwzAUhoMobk4vfAEJeKNgZ5KmaXspMqcwVFCvy2lyOiJdqkmr-PbGTb0QzE044ePLf35CDjmb8njOYYlTLpTiW2TMU5UlgmVim4yZUEVScqlGZC-EZ8ZYwXO-S0aiVELmLBsTM0fXraymtUc01i3pG7QDhjP6cHtPsWlQ94GCM3SJDqk16HrbWA297RxtOk-NDQgBae_BRtQ6qrv39eS-dAH7sE92GmgDHnzfE_J0NXu8vE4Wd_Oby4tFotMs5UkBNSiNGQejdIxaZyDTsi5MYVCVIHgdJ2V0Dnl8hBjNqLyUUmeSKWBNOiEnG--L717jEn21skFj24LDbgiVYEKUhYifRfT4D_rcDd7FdJGSeaqk5GWkTjeU9l0IHpvqxdsV-I-Ks-qr-ipWX62rj-zRt3GoV2h-yZ-uI3C-Ad5tix__m6qL-Wyj_ARM941I</recordid><startdate>201806</startdate><enddate>201806</enddate><creator>Naderi, S.</creator><creator>Bohlouli, M.</creator><creator>Yin, T.</creator><creator>König, S.</creator><general>Wiley Subscription Services, Inc</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7TK</scope><scope>7U7</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope></search><sort><creationdate>201806</creationdate><title>Genomic breeding values, SNP effects and gene identification for disease traits in cow 
training sets</title><author>Naderi, S. ; Bohlouli, M. ; Yin, T. ; König, S.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3531-8aba6ce51ad6c081b5a439b8d8de69a21b39b6dc7a7b8dafecd67944c5406a0f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Breeding</topic><topic>Cattle</topic><topic>Chromosomes</topic><topic>Correlation analysis</topic><topic>Endometritis</topic><topic>Gas1 protein</topic><topic>Genome-wide association studies</topic><topic>Genomes</topic><topic>genome‐wide associations</topic><topic>genomic BLUP</topic><topic>genomic predictions</topic><topic>Infertility</topic><topic>Lactation</topic><topic>Mastitis</topic><topic>Phenotypes</topic><topic>random forest</topic><topic>Single-nucleotide polymorphism</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Naderi, S.</creatorcontrib><creatorcontrib>Bohlouli, M.</creatorcontrib><creatorcontrib>Yin, T.</creatorcontrib><creatorcontrib>König, S.</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Neurosciences Abstracts</collection><collection>Toxicology Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Animal genetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Naderi, S.</au><au>Bohlouli, M.</au><au>Yin, T.</au><au>König, S.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Genomic breeding values, SNP effects and gene identification for disease traits in cow 
training sets</atitle><jtitle>Animal genetics</jtitle><addtitle>Anim Genet</addtitle><date>2018-06</date><risdate>2018</risdate><volume>49</volume><issue>3</issue><spage>178</spage><epage>192</epage><pages>178-192</pages><issn>0268-9146</issn><eissn>1365-2052</eissn><abstract>Summary Holstein Friesian cow training sets were created according to disease incidences. The different datasets were used to investigate the impact of random forest (RF) and genomic BLUP (GBLUP) methodology on genomic prediction accuracies. In addition, for further verifications of some specific scenarios, single‐step genomic BLUP was applied. Disease traits included the overall trait categories of (i) claw disorders, (ii) clinical mastitis and (iii) infertility from 80 741 first lactation Holstein cows kept in 58 large‐scale herds. A subset of 6744 cows was genotyped (50K SNP panel). Response variables for all scenarios were de‐regressed proofs (DRPs) and pre‐corrected phenotypes (PCPs). Initially, all sick cows were allocated to the testing set, and healthy cows represented the training set. For the ongoing cow allocation schemes, the number of sick cows in the training set increased stepwise by moving 10% of the sick cows from the testing to the training set in each step. The size of training and testing sets was kept constant by replacing the same number of cows in the testing set with (randomly selected) healthy cows from the training set. For both the RF and GBLUP methods, prediction accuracies were larger for DRPs compared to PCPs. For PCPs as a response variable, the largest prediction accuracies were observed when the disease incidences in training sets reflected the disease incidence in the whole population. A further increase in prediction accuracies for some selected cow allocation schemes (i.e. larger prediction accuracies compared to corresponding scenarios with RF or GBLUB) was achieved via single‐step GBLUP applications. 
Correlations between genome‐wide association study SNP effects and RF importance criteria for single SNPs were in a moderate range, from 0.42 to 0.57, when considering SNPs from all chromosomes or from specific chromosome segments. RF identified significant SNPs close to potential positional candidate genes: GAS1, GPAT3 and CYP2R1 for clinical mastitis; SPINK5 and SLC26A2 for laminitis; and FGF12 for endometritis.</abstract><cop>England</cop><pub>Wiley Subscription Services, Inc</pub><pmid>29624705</pmid><doi>10.1111/age.12661</doi><tpages>15</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0268-9146
ispartof Animal genetics, 2018-06, Vol.49 (3), p.178-192
issn 0268-9146
1365-2052
language eng
recordid cdi_proquest_miscellaneous_2022982353
source Wiley
subjects Breeding
Cattle
Chromosomes
Correlation analysis
Endometritis
Gas1 protein
Genome-wide association studies
Genomes
genome‐wide associations
genomic BLUP
genomic predictions
Infertility
Lactation
Mastitis
Phenotypes
random forest
Single-nucleotide polymorphism
Training
title Genomic breeding values, SNP effects and gene identification for disease traits in cow training sets
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T02%3A32%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Genomic%20breeding%20values,%20SNP%20effects%20and%20gene%20identification%20for%20disease%20traits%20in%20cow%20training%20sets&rft.jtitle=Animal%20genetics&rft.au=Naderi,%20S.&rft.date=2018-06&rft.volume=49&rft.issue=3&rft.spage=178&rft.epage=192&rft.pages=178-192&rft.issn=0268-9146&rft.eissn=1365-2052&rft_id=info:doi/10.1111/age.12661&rft_dat=%3Cproquest_cross%3E2047364419%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c3531-8aba6ce51ad6c081b5a439b8d8de69a21b39b6dc7a7b8dafecd67944c5406a0f3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2047364419&rft_id=info:pmid/29624705&rfr_iscdi=true