
Type 2 Diabetes Biomarkers of Human Gut Microbiota Selected via Iterative Sure Independent Screening Method

Type 2 diabetes, a complex metabolic disease influenced by genetic and environmental factors, has become a worldwide problem. Previously published results focused on genetic components through genome-wide association studies, which explain the disease only to some extent. Recently, two research groups...

Bibliographic Details
Published in: PloS one 2015-10, Vol.10 (10), p.e0140827-e0140827
Main Authors: Cai, Lihua; Wu, Honglong; Li, Dongfang; Zhou, Ke; Zou, Fuhao
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c692t-1f7523079c6b9fbd7898f32a0a22c4fb3b87a057d20d84ff9839cfa6bce4deba3
cites cdi_FETCH-LOGICAL-c692t-1f7523079c6b9fbd7898f32a0a22c4fb3b87a057d20d84ff9839cfa6bce4deba3
container_end_page e0140827
container_issue 10
container_start_page e0140827
container_title PloS one
container_volume 10
creator Cai, Lihua
Wu, Honglong
Li, Dongfang
Zhou, Ke
Zou, Fuhao
description Type 2 diabetes, a complex metabolic disease influenced by genetic and environmental factors, has become a worldwide problem. Previously published results focused on genetic components through genome-wide association studies, which explain the disease only to some extent. Recently, two research groups published metagenome-wide association study (MGWAS) results that identified meta-biomarkers related to type 2 diabetes. However, one key problem in analyzing genomic data is how to deal with the ultra-high dimensionality of the features. From a statistical viewpoint, it is challenging to filter out the true factors in high-dimensional data. Various methods and techniques have been proposed to address this issue, but they achieve only limited prediction performance and poor interpretability. A new statistical procedure with higher performance and clear interpretability is therefore appealing for analyzing high-dimensional data. To address this problem, we apply a statistical variable selection procedure called iterative sure independence screening to gene profiles obtained from metagenome sequencing; 48 and 24 meta-markers were selected as predictors in the Chinese and European cohorts, respectively, with AUC (area under the curve) values of 0.97 and 0.99, a better performance than other model selection methods. These results demonstrate the power and utility of data mining technologies for identifying diagnostic and predictive markers in large-scale, ultra-high-dimensional genomic datasets.
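The screening idea named in the description can be illustrated with a minimal sketch. It is not the authors' pipeline: it shows only a single, non-iterative screening pass that ranks features by absolute marginal correlation with the case/control label, plus a rank-based AUC. The function names (`marginal_correlation`, `sis_screen`, `auc`), the synthetic data, and the choice to keep 10 features are all assumptions made for illustration. The iterative variant would typically fit a model on the kept features and re-screen the remaining ones, which is omitted here.

```python
import random
from statistics import mean, pstdev


def marginal_correlation(x, y):
    """Absolute Pearson correlation between one feature column and the labels."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no marginal signal
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return abs(cov / (sx * sy))


def sis_screen(columns, y, keep):
    """One screening pass: rank features by marginal correlation, keep the top `keep`."""
    scores = [marginal_correlation(col, y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:keep]


def auc(scores, labels):
    """AUC as the fraction of (case, control) pairs where the case scores higher; ties count half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical synthetic data: 60 samples, 500 features, far more features than samples.
rng = random.Random(0)
n_samples, n_features = 60, 500
y = [i % 2 for i in range(n_samples)]  # alternating control (0) / case (1) labels
columns = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_features)]
for j in (3, 7):  # make two features informative by shifting their values in cases
    columns[j] = [v + 3.0 * label for v, label in zip(columns[j], y)]

selected = sis_screen(columns, y, keep=10)
print("selected feature indices:", selected)
print("AUC of top-ranked feature:", auc(columns[selected[0]], y))
```

Ranking by marginal correlation keeps the cost of the first pass linear in the number of features, which is what makes this kind of screening practical when features vastly outnumber samples.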
doi_str_mv 10.1371/journal.pone.0140827
format article
fullrecord <record><control><sourceid>gale_plos_</sourceid><recordid>TN_cdi_plos_journals_1723736772</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A432031372</galeid><doaj_id>oai_doaj_org_article_376bd536ad454d688f8c10b7c357769e</doaj_id><sourcerecordid>A432031372</sourcerecordid><originalsourceid>FETCH-LOGICAL-c692t-1f7523079c6b9fbd7898f32a0a22c4fb3b87a057d20d84ff9839cfa6bce4deba3</originalsourceid><addsrcrecordid>eNqNk1Fv0zAQxyMEYqPwDRBYQkLw0OLYiZ28II0BW6VNk-jg1XLsc-sujYvtVOzb467Z1KI9IEu2Zf_uf77zXZa9zvEkpzz_tHS972Q7WbsOJjgvcEX4k-w4rykZM4Lp0739UfYihCXGJa0Ye54dEVbwmhN2nN1c364BEfTVygYiBPTFupX0N-ADcgad9yvZobM-okurvGusixLNoAUVQaONlWgawctoN4BmvQc07TSsIU1dRDPlATrbzdElxIXTL7NnRrYBXg3rKPv5_dv16fn44upsenpyMVasJnGcG14SinmtWFObRvOqrgwlEktCVGEa2lRc4pJrgnVVGFNXtFZGskZBoaGRdJS93emuWxfEkKcgck4op4ynZZRNd4R2cinW3qaQb4WTVtwdOD8X0kerWhCUs0aXlEldlIVmVWUqleOGK1pyzmpIWp8Hb32zAq1S5F62B6KHN51diLnbiILlmGOWBD4MAt797iFEsbJBQdvKDlx_9-4SE8owTei7f9DHoxuouUwB2M645FdtRcVJQVM5pPrZUpNHqDQ0rKxKRWVsOj8w-HhgkJgIf-Jc9iGI6ezH_7NXvw7Z93vsAmQbF8G1fbSuC4dgsQNTJYbgwTwkOcdi2xP32RDbnhBDTySzN_sf9GB03wT0L_7lBkM</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1723736772</pqid></control><display><type>article</type><title>Type 2 Diabetes Biomarkers of Human Gut Microbiota Selected via Iterative Sure Independent Screening Method</title><source>Publicly Available Content Database</source><source>PubMed Central</source><creator>Cai, Lihua ; Wu, Honglong ; Li, Dongfang ; Zhou, Ke ; Zou, Fuhao</creator><contributor>Zhu, Dongxiao</contributor><creatorcontrib>Cai, Lihua ; Wu, Honglong ; Li, Dongfang ; Zhou, Ke ; Zou, Fuhao ; Zhu, Dongxiao</creatorcontrib><description>Type 2 diabetes, which is a complex metabolic disease influenced by genetic and environment, has become a worldwide problem. 
Previously published results focused on genetic components through genome-wide association studies, which explain the disease only to some extent. Recently, two research groups published metagenome-wide association study (MGWAS) results that identified meta-biomarkers related to type 2 diabetes. However, one key problem in analyzing genomic data is how to deal with the ultra-high dimensionality of the features. From a statistical viewpoint, it is challenging to filter out the true factors in high-dimensional data. Various methods and techniques have been proposed to address this issue, but they achieve only limited prediction performance and poor interpretability. A new statistical procedure with higher performance and clear interpretability is therefore appealing for analyzing high-dimensional data. To address this problem, we apply a statistical variable selection procedure called iterative sure independence screening to gene profiles obtained from metagenome sequencing; 48 and 24 meta-markers were selected as predictors in the Chinese and European cohorts, respectively, with AUC (area under the curve) values of 0.97 and 0.99, a better performance than other model selection methods. 
These results demonstrate the power and utility of data mining technologies within the large-scale and ultra-high dimensional genomic-related dataset for diagnostic and predictive markers identifying.</description><identifier>ISSN: 1932-6203</identifier><identifier>EISSN: 1932-6203</identifier><identifier>DOI: 10.1371/journal.pone.0140827</identifier><identifier>PMID: 26479726</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Aged ; Analysis ; Bioindicators ; Bioinformatics ; Biological markers ; Biomarkers ; Care and treatment ; Data analysis ; Data mining ; Data processing ; Datasets ; Diabetes ; Diabetes mellitus ; Diabetes mellitus (non-insulin dependent) ; Diabetes Mellitus, Type 2 - genetics ; Diabetes Mellitus, Type 2 - microbiology ; Diagnosis ; Diagnostic systems ; Dimensional analysis ; DNA methylation ; Female ; Gastrointestinal Microbiome - genetics ; Gene expression ; Gene sequencing ; Genetic Markers - genetics ; Genome-wide association studies ; Genome-Wide Association Study ; Genomes ; Genomics ; Humans ; Intestinal microflora ; Iterative methods ; Laboratories ; Male ; Medical screening ; Microbiota ; Microbiota (Symbiotic organisms) ; Middle Aged ; Predictions ; Regularization methods ; Researchers ; Risk factors ; Science ; Signal transduction ; Sparsity ; Statistics ; Type 2 diabetes</subject><ispartof>PloS one, 2015-10, Vol.10 (10), p.e0140827-e0140827</ispartof><rights>COPYRIGHT 2015 Public Library of Science</rights><rights>2015 Cai et al. 
This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>2015 Cai et al 2015 Cai et al</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c692t-1f7523079c6b9fbd7898f32a0a22c4fb3b87a057d20d84ff9839cfa6bce4deba3</citedby><cites>FETCH-LOGICAL-c692t-1f7523079c6b9fbd7898f32a0a22c4fb3b87a057d20d84ff9839cfa6bce4deba3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/1723736772/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/1723736772?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,725,778,782,883,25736,27907,27908,36995,36996,44573,53774,53776,74877</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/26479726$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Zhu, Dongxiao</contributor><creatorcontrib>Cai, Lihua</creatorcontrib><creatorcontrib>Wu, Honglong</creatorcontrib><creatorcontrib>Li, Dongfang</creatorcontrib><creatorcontrib>Zhou, Ke</creatorcontrib><creatorcontrib>Zou, Fuhao</creatorcontrib><title>Type 2 Diabetes Biomarkers of Human Gut Microbiota Selected via Iterative Sure Independent Screening Method</title><title>PloS one</title><addtitle>PLoS One</addtitle><description>Type 2 diabetes, which is a complex metabolic disease influenced by genetic and environment, has 
become a worldwide problem. Previously published results focused on genetic components through genome-wide association studies, which explain the disease only to some extent. Recently, two research groups published metagenome-wide association study (MGWAS) results that identified meta-biomarkers related to type 2 diabetes. However, one key problem in analyzing genomic data is how to deal with the ultra-high dimensionality of the features. From a statistical viewpoint, it is challenging to filter out the true factors in high-dimensional data. Various methods and techniques have been proposed to address this issue, but they achieve only limited prediction performance and poor interpretability. A new statistical procedure with higher performance and clear interpretability is therefore appealing for analyzing high-dimensional data. To address this problem, we apply a statistical variable selection procedure called iterative sure independence screening to gene profiles obtained from metagenome sequencing; 48 and 24 meta-markers were selected as predictors in the Chinese and European cohorts, respectively, with AUC (area under the curve) values of 0.97 and 0.99, a better performance than other model selection methods. 
These results demonstrate the power and utility of data mining technologies within the large-scale and ultra-high dimensional genomic-related dataset for diagnostic and predictive markers identifying.</description><subject>Aged</subject><subject>Analysis</subject><subject>Bioindicators</subject><subject>Bioinformatics</subject><subject>Biological markers</subject><subject>Biomarkers</subject><subject>Care and treatment</subject><subject>Data analysis</subject><subject>Data mining</subject><subject>Data processing</subject><subject>Datasets</subject><subject>Diabetes</subject><subject>Diabetes mellitus</subject><subject>Diabetes mellitus (non-insulin dependent)</subject><subject>Diabetes Mellitus, Type 2 - genetics</subject><subject>Diabetes Mellitus, Type 2 - microbiology</subject><subject>Diagnosis</subject><subject>Diagnostic systems</subject><subject>Dimensional analysis</subject><subject>DNA methylation</subject><subject>Female</subject><subject>Gastrointestinal Microbiome - genetics</subject><subject>Gene expression</subject><subject>Gene sequencing</subject><subject>Genetic Markers - genetics</subject><subject>Genome-wide association studies</subject><subject>Genome-Wide Association Study</subject><subject>Genomes</subject><subject>Genomics</subject><subject>Humans</subject><subject>Intestinal microflora</subject><subject>Iterative methods</subject><subject>Laboratories</subject><subject>Male</subject><subject>Medical screening</subject><subject>Microbiota</subject><subject>Microbiota (Symbiotic organisms)</subject><subject>Middle Aged</subject><subject>Predictions</subject><subject>Regularization methods</subject><subject>Researchers</subject><subject>Risk factors</subject><subject>Science</subject><subject>Signal transduction</subject><subject>Sparsity</subject><subject>Statistics</subject><subject>Type 2 
diabetes</subject><issn>1932-6203</issn><issn>1932-6203</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNqNk1Fv0zAQxyMEYqPwDRBYQkLw0OLYiZ28II0BW6VNk-jg1XLsc-sujYvtVOzb467Z1KI9IEu2Zf_uf77zXZa9zvEkpzz_tHS972Q7WbsOJjgvcEX4k-w4rykZM4Lp0739UfYihCXGJa0Ye54dEVbwmhN2nN1c364BEfTVygYiBPTFupX0N-ADcgad9yvZobM-okurvGusixLNoAUVQaONlWgawctoN4BmvQc07TSsIU1dRDPlATrbzdElxIXTL7NnRrYBXg3rKPv5_dv16fn44upsenpyMVasJnGcG14SinmtWFObRvOqrgwlEktCVGEa2lRc4pJrgnVVGFNXtFZGskZBoaGRdJS93emuWxfEkKcgck4op4ynZZRNd4R2cinW3qaQb4WTVtwdOD8X0kerWhCUs0aXlEldlIVmVWUqleOGK1pyzmpIWp8Hb32zAq1S5F62B6KHN51diLnbiILlmGOWBD4MAt797iFEsbJBQdvKDlx_9-4SE8owTei7f9DHoxuouUwB2M645FdtRcVJQVM5pPrZUpNHqDQ0rKxKRWVsOj8w-HhgkJgIf-Jc9iGI6ezH_7NXvw7Z93vsAmQbF8G1fbSuC4dgsQNTJYbgwTwkOcdi2xP32RDbnhBDTySzN_sf9GB03wT0L_7lBkM</recordid><startdate>20151019</startdate><enddate>20151019</enddate><creator>Cai, Lihua</creator><creator>Wu, Honglong</creator><creator>Li, Dongfang</creator><creator>Zhou, Ke</creator><creator>Zou, Fuhao</creator><general>Public Library of Science</general><general>Public Library of Science 
(PLoS)</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>IOV</scope><scope>ISR</scope><scope>3V.</scope><scope>7QG</scope><scope>7QL</scope><scope>7QO</scope><scope>7RV</scope><scope>7SN</scope><scope>7SS</scope><scope>7T5</scope><scope>7TG</scope><scope>7TM</scope><scope>7U9</scope><scope>7X2</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AO</scope><scope>8C1</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>C1K</scope><scope>CCPQU</scope><scope>D1I</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>H94</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>KB.</scope><scope>KB0</scope><scope>KL.</scope><scope>L6V</scope><scope>LK8</scope><scope>M0K</scope><scope>M0S</scope><scope>M1P</scope><scope>M7N</scope><scope>M7P</scope><scope>M7S</scope><scope>NAPCQ</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PATMY</scope><scope>PDBOC</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>PYCSY</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20151019</creationdate><title>Type 2 Diabetes Biomarkers of Human Gut Microbiota Selected via Iterative Sure Independent Screening Method</title><author>Cai, Lihua ; Wu, Honglong ; Li, Dongfang ; Zhou, Ke ; Zou, 
Fuhao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c692t-1f7523079c6b9fbd7898f32a0a22c4fb3b87a057d20d84ff9839cfa6bce4deba3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Aged</topic><topic>Analysis</topic><topic>Bioindicators</topic><topic>Bioinformatics</topic><topic>Biological markers</topic><topic>Biomarkers</topic><topic>Care and treatment</topic><topic>Data analysis</topic><topic>Data mining</topic><topic>Data processing</topic><topic>Datasets</topic><topic>Diabetes</topic><topic>Diabetes mellitus</topic><topic>Diabetes mellitus (non-insulin dependent)</topic><topic>Diabetes Mellitus, Type 2 - genetics</topic><topic>Diabetes Mellitus, Type 2 - microbiology</topic><topic>Diagnosis</topic><topic>Diagnostic systems</topic><topic>Dimensional analysis</topic><topic>DNA methylation</topic><topic>Female</topic><topic>Gastrointestinal Microbiome - genetics</topic><topic>Gene expression</topic><topic>Gene sequencing</topic><topic>Genetic Markers - genetics</topic><topic>Genome-wide association studies</topic><topic>Genome-Wide Association Study</topic><topic>Genomes</topic><topic>Genomics</topic><topic>Humans</topic><topic>Intestinal microflora</topic><topic>Iterative methods</topic><topic>Laboratories</topic><topic>Male</topic><topic>Medical screening</topic><topic>Microbiota</topic><topic>Microbiota (Symbiotic organisms)</topic><topic>Middle Aged</topic><topic>Predictions</topic><topic>Regularization methods</topic><topic>Researchers</topic><topic>Risk factors</topic><topic>Science</topic><topic>Signal transduction</topic><topic>Sparsity</topic><topic>Statistics</topic><topic>Type 2 diabetes</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Cai, Lihua</creatorcontrib><creatorcontrib>Wu, Honglong</creatorcontrib><creatorcontrib>Li, Dongfang</creatorcontrib><creatorcontrib>Zhou, 
Ke</creatorcontrib><creatorcontrib>Zou, Fuhao</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Opposing Viewpoints Resource Center</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Animal Behavior Abstracts</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology Research Abstracts</collection><collection>Nursing &amp; Allied Health Database</collection><collection>Ecology Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Immunology Abstracts</collection><collection>Meteorological &amp; Geoastrophysical Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Agricultural Science Collection</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Public Health Database</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central</collection><collection>Advanced 
Technologies &amp; Aerospace Collection</collection><collection>Agricultural &amp; Environmental Science Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ProQuest One Community College</collection><collection>ProQuest Materials Science Collection</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Materials Science Database</collection><collection>Nursing &amp; Allied Health Database (Alumni Edition)</collection><collection>Meteorological &amp; Geoastrophysical Abstracts - Academic</collection><collection>ProQuest Engineering Collection</collection><collection>Biological Sciences</collection><collection>Agriculture Science Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>PML(ProQuest Medical Library)</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biological Science Database</collection><collection>Engineering Database</collection><collection>Nursing &amp; Allied Health Premium</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering 
Abstracts</collection><collection>Environmental Science Database</collection><collection>Materials Science Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>Environmental Science Collection</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PloS one</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Cai, Lihua</au><au>Wu, Honglong</au><au>Li, Dongfang</au><au>Zhou, Ke</au><au>Zou, Fuhao</au><au>Zhu, Dongxiao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Type 2 Diabetes Biomarkers of Human Gut Microbiota Selected via Iterative Sure Independent Screening Method</atitle><jtitle>PloS one</jtitle><addtitle>PLoS One</addtitle><date>2015-10-19</date><risdate>2015</risdate><volume>10</volume><issue>10</issue><spage>e0140827</spage><epage>e0140827</epage><pages>e0140827-e0140827</pages><issn>1932-6203</issn><eissn>1932-6203</eissn><abstract>Type 2 diabetes, which is a complex metabolic disease influenced by genetic and environment, has become a worldwide problem. Previous published results focused on genetic components through genome-wide association studies that just interpret this disease to some extent. Recently, two research groups published metagenome-wide association studies (MGWAS) result that found meta-biomarkers related with type 2 diabetes. However, One key problem of analyzing genomic data is that how to deal with the ultra-high dimensionality of features. 
From a statistical viewpoint it is challenging to filter true factors in high dimensional data. Various methods and techniques have been proposed on this issue, which can only achieve limited prediction performance and poor interpretability. New statistical procedure with higher performance and clear interpretability is appealing in analyzing high dimensional data. To address this problem, we apply an excellent statistical variable selection procedure called iterative sure independence screening to gene profiles that obtained from metagenome sequencing, and 48/24 meta-markers were selected in Chinese/European cohorts as predictors with 0.97/0.99 accuracy in AUC (area under the curve), which showed a better performance than other model selection methods, respectively. These results demonstrate the power and utility of data mining technologies within the large-scale and ultra-high dimensional genomic-related dataset for diagnostic and predictive markers identifying.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>26479726</pmid><doi>10.1371/journal.pone.0140827</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1932-6203
ispartof PloS one, 2015-10, Vol.10 (10), p.e0140827-e0140827
issn 1932-6203
1932-6203
language eng
recordid cdi_plos_journals_1723736772
source Publicly Available Content Database; PubMed Central
subjects Aged
Analysis
Bioindicators
Bioinformatics
Biological markers
Biomarkers
Care and treatment
Data analysis
Data mining
Data processing
Datasets
Diabetes
Diabetes mellitus
Diabetes mellitus (non-insulin dependent)
Diabetes Mellitus, Type 2 - genetics
Diabetes Mellitus, Type 2 - microbiology
Diagnosis
Diagnostic systems
Dimensional analysis
DNA methylation
Female
Gastrointestinal Microbiome - genetics
Gene expression
Gene sequencing
Genetic Markers - genetics
Genome-wide association studies
Genome-Wide Association Study
Genomes
Genomics
Humans
Intestinal microflora
Iterative methods
Laboratories
Male
Medical screening
Microbiota
Microbiota (Symbiotic organisms)
Middle Aged
Predictions
Regularization methods
Researchers
Risk factors
Science
Signal transduction
Sparsity
Statistics
Type 2 diabetes
title Type 2 Diabetes Biomarkers of Human Gut Microbiota Selected via Iterative Sure Independent Screening Method
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T18%3A42%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Type%202%20Diabetes%20Biomarkers%20of%20Human%20Gut%20Microbiota%20Selected%20via%20Iterative%20Sure%20Independent%20Screening%20Method&rft.jtitle=PloS%20one&rft.au=Cai,%20Lihua&rft.date=2015-10-19&rft.volume=10&rft.issue=10&rft.spage=e0140827&rft.epage=e0140827&rft.pages=e0140827-e0140827&rft.issn=1932-6203&rft.eissn=1932-6203&rft_id=info:doi/10.1371/journal.pone.0140827&rft_dat=%3Cgale_plos_%3EA432031372%3C/gale_plos_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c692t-1f7523079c6b9fbd7898f32a0a22c4fb3b87a057d20d84ff9839cfa6bce4deba3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1723736772&rft_id=info:pmid/26479726&rft_galeid=A432031372&rfr_iscdi=true