Exploring the correlation between DNA methylation and biological age using an interpretable machine learning framework

DNA methylation plays a significant role in regulating transcription and changes systematically with age. These changes can be used to predict an individual’s age. The aims of this study were, first, to identify methylation sites associated with biological age; second, to construct a biological age prediction model and pr...

Bibliographic Details
Published in: Scientific reports 2024-10, Vol.14 (1), p.24208-13, Article 24208
Main Authors: Zhou, Sheng, Chen, Jing, Wei, Shanshan, Zhou, Chengxing, Wang, Die, Yan, Xiaofan, He, Xun, Yan, Pengcheng
Format: Article
Language: English
cited_by
cites cdi_FETCH-LOGICAL-c422t-85e911dea84c5a11da588745d745293d5ba8a62701115cb64ab8e759a7fac6a23
container_end_page 13
container_issue 1
container_start_page 24208
container_title Scientific reports
container_volume 14
creator Zhou, Sheng
Chen, Jing
Wei, Shanshan
Zhou, Chengxing
Wang, Die
Yan, Xiaofan
He, Xun
Yan, Pengcheng
description DNA methylation plays a significant role in regulating transcription and changes systematically with age. These changes can be used to predict an individual’s age. The aims of this study were, first, to identify methylation sites associated with biological age and, second, to construct a biological age prediction model and preliminarily explore the biological significance of methylation-associated genes using machine learning. A biological age prediction model was constructed from human methylation data through data preprocessing, feature selection, statistical analysis, and machine learning techniques. Subsequently, 15 methylation data sets were analyzed in depth using SHAP, GO enrichment, and KEGG analysis. XGBoost, LightGBM, and CatBoost identified 15 groups of methylation sites associated with biological age. Based on the calculated SHAP values, the cg23995914 locus was identified as the most significant contributor to predicting biological age. Furthermore, GO enrichment and KEGG analyses were used for an initial exploration of the biological significance of the methylated loci.
doi_str_mv 10.1038/s41598-024-75586-9
format article
fullrecord <record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_9357a049144045f0891a3ccb854302f4</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_9357a049144045f0891a3ccb854302f4</doaj_id><sourcerecordid>3116761136</sourcerecordid><originalsourceid>FETCH-LOGICAL-c422t-85e911dea84c5a11da588745d745293d5ba8a62701115cb64ab8e759a7fac6a23</originalsourceid><addsrcrecordid>eNp9kk1v1DAQhiMEotXSP8ABWeLCJeDPxD6hqhRaqYILnK2JM9n1ktiLnW3pv8fb3ZaWA5ZGHo1fP_aM3qp6zeh7RoX-kCVTRteUy7pVSje1eVYdcypVzQXnzx_lR9VJzmtaluJGMvOyOhJG0ka3zXF1ff57M8bkw5LMKyQupoQjzD4G0uF8gxjIp6-nZMJ5dXuoQ-hJ5-MYl97BSGCJZJt3AAjEhxnTJuEM3YhkArfyAcmIkMJOMSSY8Camn6-qFwOMGU8O-6L68fn8-9lFffXty-XZ6VXtJOdzrRUaxnoELZ2CkoHSupWqL8GN6FUHGhreUsaYcl0jodPYKgPtAK4BLhbV5Z7bR1jbTfITpFsbwdu7QkxLC2n2bkRrhGqBSsOkLJMbqDYMhHOdVlJQPsjC-rhnbbbdhL3DMCcYn0CfngS_sst4bRmTuoBVIbw7EFL8tcU828lnh-MIAeM2W8FYS1uhSyyqt_9I13GbQpnVTtW0DWOiKSq-V7kUc044PPyGUbuzid3bxBab2DublDYX1ZvHfTxcuTdFEYi9IG92xsD09-3_YP8AnmTI7Q</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3116761136</pqid></control><display><type>article</type><title>Exploring the correlation between DNA methylation and biological age using an interpretable machine learning framework</title><source>Full-Text Journals in Chemistry (Open access)</source><source>Publicly Available Content (ProQuest)</source><source>PubMed Central</source><source>Springer Nature - nature.com Journals - Fully Open Access</source><creator>Zhou, Sheng ; Chen, Jing ; Wei, Shanshan ; Zhou, Chengxing ; Wang, Die ; Yan, Xiaofan ; He, Xun ; Yan, Pengcheng</creator><creatorcontrib>Zhou, Sheng ; Chen, Jing ; Wei, Shanshan ; Zhou, Chengxing ; Wang, Die ; Yan, Xiaofan ; He, Xun ; Yan, Pengcheng</creatorcontrib><description>DNA methylation plays a significant role in regulating transcription and exhibits a systematic change with age. 
These changes can be used to predict an individual’s age. First, to identify methylation sites associated with biological age; second, to construct a biological age prediction model and preliminarily explore the biological significance of methylation-associated genes using machine learning. A biological age prediction model was constructed using human methylation data through data preprocessing, feature selection procedures, statistical analysis, and machine learning techniques. Subsequently, 15 methylation data sets were subjected to in-depth analysis using SHAP, GO enrichment, and KEGG analysis. XGBoost, LightGBM, and CatBoost identified 15 groups of methylation sites associated with biological age. The cg23995914 locus was identified as the most significant contributor to predicting biological age by calculating SHAP values. Furthermore, GO enrichment and KEGG analyses were employed to initially explore the methylated loci’s biological significance.</description><identifier>ISSN: 2045-2322</identifier><identifier>EISSN: 2045-2322</identifier><identifier>DOI: 10.1038/s41598-024-75586-9</identifier><identifier>PMID: 39406876</identifier><language>eng</language><publisher>London: Nature Publishing Group UK</publisher><subject>631/114/1305 ; 631/208/176/1988 ; Age ; Aging ; Aging - genetics ; Biological age ; Biomarkers ; CpG Islands ; Datasets ; Deoxyribonucleic acid ; DNA ; DNA Methylation ; Epigenetics ; Experimental methods ; Gender ; Gene expression ; Gene loci ; Genomes ; GO enrichment analysis ; Health care ; Humanities and Social Sciences ; Humans ; Interpretable machine learning ; Learning algorithms ; Machine Learning ; Male ; Medicine ; multidisciplinary ; Physiology ; Prediction models ; Research methodology ; Science ; Science (multidisciplinary) ; Shapley Additive exPlanations ; Statistical analysis ; Statistical models ; XGBoost</subject><ispartof>Scientific reports, 2024-10, Vol.14 (1), p.24208-13, Article 24208</ispartof><rights>The Author(s) 
2024</rights><rights>2024. The Author(s).</rights><rights>The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>The Author(s) 2024 2024</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c422t-85e911dea84c5a11da588745d745293d5ba8a62701115cb64ab8e759a7fac6a23</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/3116761136/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/3116761136?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,25753,27924,27925,37012,37013,44590,53791,53793,74998</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/39406876$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhou, Sheng</creatorcontrib><creatorcontrib>Chen, Jing</creatorcontrib><creatorcontrib>Wei, Shanshan</creatorcontrib><creatorcontrib>Zhou, Chengxing</creatorcontrib><creatorcontrib>Wang, Die</creatorcontrib><creatorcontrib>Yan, Xiaofan</creatorcontrib><creatorcontrib>He, Xun</creatorcontrib><creatorcontrib>Yan, Pengcheng</creatorcontrib><title>Exploring the correlation between DNA methylation and biological age using an interpretable machine learning framework</title><title>Scientific reports</title><addtitle>Sci Rep</addtitle><addtitle>Sci Rep</addtitle><description>DNA methylation plays a significant role in regulating transcription and exhibits a systematic change with age. These changes can be used to predict an individual’s age. 
First, to identify methylation sites associated with biological age; second, to construct a biological age prediction model and preliminarily explore the biological significance of methylation-associated genes using machine learning. A biological age prediction model was constructed using human methylation data through data preprocessing, feature selection procedures, statistical analysis, and machine learning techniques. Subsequently, 15 methylation data sets were subjected to in-depth analysis using SHAP, GO enrichment, and KEGG analysis. XGBoost, LightGBM, and CatBoost identified 15 groups of methylation sites associated with biological age. The cg23995914 locus was identified as the most significant contributor to predicting biological age by calculating SHAP values. Furthermore, GO enrichment and KEGG analyses were employed to initially explore the methylated loci’s biological significance.</description><subject>631/114/1305</subject><subject>631/208/176/1988</subject><subject>Age</subject><subject>Aging</subject><subject>Aging - genetics</subject><subject>Biological age</subject><subject>Biomarkers</subject><subject>CpG Islands</subject><subject>Datasets</subject><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>DNA Methylation</subject><subject>Epigenetics</subject><subject>Experimental methods</subject><subject>Gender</subject><subject>Gene expression</subject><subject>Gene loci</subject><subject>Genomes</subject><subject>GO enrichment analysis</subject><subject>Health care</subject><subject>Humanities and Social Sciences</subject><subject>Humans</subject><subject>Interpretable machine learning</subject><subject>Learning algorithms</subject><subject>Machine Learning</subject><subject>Male</subject><subject>Medicine</subject><subject>multidisciplinary</subject><subject>Physiology</subject><subject>Prediction models</subject><subject>Research methodology</subject><subject>Science</subject><subject>Science 
(multidisciplinary)</subject><subject>Shapley Additive exPlanations</subject><subject>Statistical analysis</subject><subject>Statistical models</subject><subject>XGBoost</subject><issn>2045-2322</issn><issn>2045-2322</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNp9kk1v1DAQhiMEotXSP8ABWeLCJeDPxD6hqhRaqYILnK2JM9n1ktiLnW3pv8fb3ZaWA5ZGHo1fP_aM3qp6zeh7RoX-kCVTRteUy7pVSje1eVYdcypVzQXnzx_lR9VJzmtaluJGMvOyOhJG0ka3zXF1ff57M8bkw5LMKyQupoQjzD4G0uF8gxjIp6-nZMJ5dXuoQ-hJ5-MYl97BSGCJZJt3AAjEhxnTJuEM3YhkArfyAcmIkMJOMSSY8Camn6-qFwOMGU8O-6L68fn8-9lFffXty-XZ6VXtJOdzrRUaxnoELZ2CkoHSupWqL8GN6FUHGhreUsaYcl0jodPYKgPtAK4BLhbV5Z7bR1jbTfITpFsbwdu7QkxLC2n2bkRrhGqBSsOkLJMbqDYMhHOdVlJQPsjC-rhnbbbdhL3DMCcYn0CfngS_sst4bRmTuoBVIbw7EFL8tcU828lnh-MIAeM2W8FYS1uhSyyqt_9I13GbQpnVTtW0DWOiKSq-V7kUc044PPyGUbuzid3bxBab2DublDYX1ZvHfTxcuTdFEYi9IG92xsD09-3_YP8AnmTI7Q</recordid><startdate>20241015</startdate><enddate>20241015</enddate><creator>Zhou, Sheng</creator><creator>Chen, Jing</creator><creator>Wei, Shanshan</creator><creator>Zhou, Chengxing</creator><creator>Wang, Die</creator><creator>Yan, Xiaofan</creator><creator>He, Xun</creator><creator>Yan, Pengcheng</creator><general>Nature Publishing Group UK</general><general>Nature Publishing Group</general><general>Nature 
Portfolio</general><scope>C6C</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7X7</scope><scope>7XB</scope><scope>88A</scope><scope>88E</scope><scope>88I</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M2P</scope><scope>M7P</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20241015</creationdate><title>Exploring the correlation between DNA methylation and biological age using an interpretable machine learning framework</title><author>Zhou, Sheng ; Chen, Jing ; Wei, Shanshan ; Zhou, Chengxing ; Wang, Die ; Yan, Xiaofan ; He, Xun ; Yan, Pengcheng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c422t-85e911dea84c5a11da588745d745293d5ba8a62701115cb64ab8e759a7fac6a23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>631/114/1305</topic><topic>631/208/176/1988</topic><topic>Age</topic><topic>Aging</topic><topic>Aging - genetics</topic><topic>Biological age</topic><topic>Biomarkers</topic><topic>CpG Islands</topic><topic>Datasets</topic><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>DNA Methylation</topic><topic>Epigenetics</topic><topic>Experimental methods</topic><topic>Gender</topic><topic>Gene expression</topic><topic>Gene loci</topic><topic>Genomes</topic><topic>GO 
enrichment analysis</topic><topic>Health care</topic><topic>Humanities and Social Sciences</topic><topic>Humans</topic><topic>Interpretable machine learning</topic><topic>Learning algorithms</topic><topic>Machine Learning</topic><topic>Male</topic><topic>Medicine</topic><topic>multidisciplinary</topic><topic>Physiology</topic><topic>Prediction models</topic><topic>Research methodology</topic><topic>Science</topic><topic>Science (multidisciplinary)</topic><topic>Shapley Additive exPlanations</topic><topic>Statistical analysis</topic><topic>Statistical models</topic><topic>XGBoost</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhou, Sheng</creatorcontrib><creatorcontrib>Chen, Jing</creatorcontrib><creatorcontrib>Wei, Shanshan</creatorcontrib><creatorcontrib>Zhou, Chengxing</creatorcontrib><creatorcontrib>Wang, Die</creatorcontrib><creatorcontrib>Yan, Xiaofan</creatorcontrib><creatorcontrib>He, Xun</creatorcontrib><creatorcontrib>Yan, Pengcheng</creatorcontrib><collection>SpringerOpen</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Biology Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central 
(Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Biological Sciences</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>ProQuest Science Journals</collection><collection>Biological Science Database</collection><collection>Publicly Available Content (ProQuest)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>Directory of Open Access Journals</collection><jtitle>Scientific reports</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhou, Sheng</au><au>Chen, Jing</au><au>Wei, Shanshan</au><au>Zhou, Chengxing</au><au>Wang, Die</au><au>Yan, Xiaofan</au><au>He, Xun</au><au>Yan, Pengcheng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Exploring the correlation between DNA methylation and biological age using an interpretable machine learning 
framework</atitle><jtitle>Scientific reports</jtitle><stitle>Sci Rep</stitle><addtitle>Sci Rep</addtitle><date>2024-10-15</date><risdate>2024</risdate><volume>14</volume><issue>1</issue><spage>24208</spage><epage>13</epage><pages>24208-13</pages><artnum>24208</artnum><issn>2045-2322</issn><eissn>2045-2322</eissn><abstract>DNA methylation plays a significant role in regulating transcription and exhibits a systematic change with age. These changes can be used to predict an individual’s age. First, to identify methylation sites associated with biological age; second, to construct a biological age prediction model and preliminarily explore the biological significance of methylation-associated genes using machine learning. A biological age prediction model was constructed using human methylation data through data preprocessing, feature selection procedures, statistical analysis, and machine learning techniques. Subsequently, 15 methylation data sets were subjected to in-depth analysis using SHAP, GO enrichment, and KEGG analysis. XGBoost, LightGBM, and CatBoost identified 15 groups of methylation sites associated with biological age. The cg23995914 locus was identified as the most significant contributor to predicting biological age by calculating SHAP values. Furthermore, GO enrichment and KEGG analyses were employed to initially explore the methylated loci’s biological significance.</abstract><cop>London</cop><pub>Nature Publishing Group UK</pub><pmid>39406876</pmid><doi>10.1038/s41598-024-75586-9</doi><tpages>13</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2045-2322
ispartof Scientific reports, 2024-10, Vol.14 (1), p.24208-13, Article 24208
issn 2045-2322
2045-2322
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_9357a049144045f0891a3ccb854302f4
source Full-Text Journals in Chemistry (Open access); Publicly Available Content (ProQuest); PubMed Central; Springer Nature - nature.com Journals - Fully Open Access
subjects 631/114/1305
631/208/176/1988
Age
Aging
Aging - genetics
Biological age
Biomarkers
CpG Islands
Datasets
Deoxyribonucleic acid
DNA
DNA Methylation
Epigenetics
Experimental methods
Gender
Gene expression
Gene loci
Genomes
GO enrichment analysis
Health care
Humanities and Social Sciences
Humans
Interpretable machine learning
Learning algorithms
Machine Learning
Male
Medicine
multidisciplinary
Physiology
Prediction models
Research methodology
Science
Science (multidisciplinary)
Shapley Additive exPlanations
Statistical analysis
Statistical models
XGBoost
title Exploring the correlation between DNA methylation and biological age using an interpretable machine learning framework
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T23%3A26%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exploring%20the%20correlation%20between%20DNA%20methylation%20and%20biological%20age%20using%20an%20interpretable%20machine%20learning%20framework&rft.jtitle=Scientific%20reports&rft.au=Zhou,%20Sheng&rft.date=2024-10-15&rft.volume=14&rft.issue=1&rft.spage=24208&rft.epage=13&rft.pages=24208-13&rft.artnum=24208&rft.issn=2045-2322&rft.eissn=2045-2322&rft_id=info:doi/10.1038/s41598-024-75586-9&rft_dat=%3Cproquest_doaj_%3E3116761136%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c422t-85e911dea84c5a11da588745d745293d5ba8a62701115cb64ab8e759a7fac6a23%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3116761136&rft_id=info:pmid/39406876&rfr_iscdi=true