Machine learning and bioinformatic analysis of brain and blood mRNA profiles in major depressive disorder: A case–control study
This study analyzed gene expression messenger RNA data, from cases with major depressive disorder (MDD) and controls, using supervised machine learning (ML). We built on the methodology of prior studies to obtain more generalizable/reproducible results. First, we obtained a classifier trained on gen...
Published in: | American journal of medical genetics. Part B, Neuropsychiatric genetics, 2021-03, Vol.186 (2), p.101-112 |
---|---|
Main Authors: | Qi, Bill; Ramamurthy, Janani; Bennani, Imane; Trakadis, Yannis J. |
Format: | Article |
Language: | English |
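Subjects: | bioinformatics; DNA microarrays; Gene expression; Genetics; Learning algorithms; Machine learning; major depression; Mental depression; Predictions; Prefrontal cortex; transcriptomics |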
cited_by | cdi_FETCH-LOGICAL-c3649-f071b3753b9c010345b197f403b383790a539deebc0813145c50d543af0186823 |
---|---|
cites | cdi_FETCH-LOGICAL-c3649-f071b3753b9c010345b197f403b383790a539deebc0813145c50d543af0186823 |
container_end_page | 112 |
container_issue | 2 |
container_start_page | 101 |
container_title | American journal of medical genetics. Part B, Neuropsychiatric genetics |
container_volume | 186 |
creator | Qi, Bill ; Ramamurthy, Janani ; Bennani, Imane ; Trakadis, Yannis J. |
description | This study analyzed gene expression messenger RNA data, from cases with major depressive disorder (MDD) and controls, using supervised machine learning (ML). We built on the methodology of prior studies to obtain more generalizable/reproducible results. First, we obtained a classifier trained on gene expression data from the dorsolateral prefrontal cortex of post‐mortem MDD cases (n = 126) and controls (n = 103). An average area‐under‐the‐receiver‐operating‐characteristics‐curve (AUC) from 10‐fold cross‐validation of 0.72 was noted, compared to an average AUC of 0.55 for a baseline classifier (p = .0048). The classifier achieved an AUC of 0.76 on a previously unused testing‐set. We also performed external validation using DLPFC gene expression values from an independent cohort of matched MDD cases (n = 29) and controls (n = 29), obtained from Affymetrix microarray (vs. Illumina microarray for the original cohort) (AUC: 0.62). We highlighted gene sets differentially expressed in MDD that were enriched for genes identified by the ML algorithm. Next, we assessed the ML classification performance in blood‐based microarray gene expression data from MDD cases (n = 1,581) and controls (n = 369). We observed a mean AUC of 0.64 on 10‐fold cross‐validation, which was significantly above baseline (p = .0020). Similar performance was observed on the testing‐set (AUC: 0.61). Finally, we analyzed the classification performance in covariates subgroups. We identified an interesting interaction between smoking and recall performance in MDD case prediction (58% accurate predictions in cases who are smokers vs. 43% accurate predictions in cases who are non‐smokers). Overall, our results suggest that ML in combination with gene expression data and covariates could further our understanding of the pathophysiology in MDD. |
doi_str_mv | 10.1002/ajmg.b.32839 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2494878234</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2494878234</sourcerecordid><originalsourceid>FETCH-LOGICAL-c3649-f071b3753b9c010345b197f403b383790a539deebc0813145c50d543af0186823</originalsourceid><addsrcrecordid>eNp90btKBDEUBuAgivfOWgI2Fu6aTBJnYrcuXvEComAXchvNMjNZkx1lO30G39AnMe6sFhZWCcnHz-H8AGxh1McIZftyVD_2VZ9kBeELYBUzlvVowR4Wf-8Ur4C1GEcIEcTyfBmsEHJAGUfFKni_kvrJNRZWVobGNY9QNgYq511T-lDLidPpRVbT6CL0JVRBuqYzlfcG1rfXAzgOvnSVjTB91XLkAzR2HGyM7sVC46IPxoZDOIBaRvv59qF9Mwm-gnHSmukGWCplFe3m_FwH9yfHd8Oz3uXN6flwcNnTaVjeK1GOFckZUVwjjAhlCvO8pIgoUpCcI8kIN9YqjQpMMGWaIcMokSXCxUGRkXWw2-WmaZ9bGyeidlHbqpKN9W0UGeW0yBOkie78oSPfhrSFmeIZZSk_qb1O6eBjDLYU4-BqGaYCI_FdjfiuRigxqybx7Xloq2prfvFPFwnQDrymXU7_DRODi6vToy73C3xkmzs</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2499245145</pqid></control><display><type>article</type><title>Machine learning and bioinformatic analysis of brain and blood mRNA profiles in major depressive disorder: A case–control study</title><source>Wiley-Blackwell Read & Publish Collection</source><creator>Qi, Bill ; Ramamurthy, Janani ; Bennani, Imane ; Trakadis, Yannis J.</creator><creatorcontrib>Qi, Bill ; Ramamurthy, Janani ; Bennani, Imane ; Trakadis, Yannis J.</creatorcontrib><description>This study analyzed gene expression messenger RNA data, from cases with major depressive disorder (MDD) and controls, using supervised machine learning (ML). We built on the methodology of prior studies to obtain more generalizable/reproducible results. First, we obtained a classifier trained on gene expression data from the dorsolateral prefrontal cortex of post‐mortem MDD cases (n = 126) and controls (n = 103). 
An average area‐under‐the‐receiver‐operating‐characteristics‐curve (AUC) from 10‐fold cross‐validation of 0.72 was noted, compared to an average AUC of 0.55 for a baseline classifier (p = .0048). The classifier achieved an AUC of 0.76 on a previously unused testing‐set. We also performed external validation using DLPFC gene expression values from an independent cohort of matched MDD cases (n = 29) and controls (n = 29), obtained from Affymetrix microarray (vs. Illumina microarray for the original cohort) (AUC: 0.62). We highlighted gene sets differentially expressed in MDD that were enriched for genes identified by the ML algorithm. Next, we assessed the ML classification performance in blood‐based microarray gene expression data from MDD cases (n = 1,581) and controls (n = 369). We observed a mean AUC of 0.64 on 10‐fold cross‐validation, which was significantly above baseline (p = .0020). Similar performance was observed on the testing‐set (AUC: 0.61). Finally, we analyzed the classification performance in covariates subgroups. We identified an interesting interaction between smoking and recall performance in MDD case prediction (58% accurate predictions in cases who are smokers vs. 43% accurate predictions in cases who are non‐smokers). Overall, our results suggest that ML in combination with gene expression data and covariates could further our understanding of the pathophysiology in MDD.</description><identifier>ISSN: 1552-4841</identifier><identifier>EISSN: 1552-485X</identifier><identifier>DOI: 10.1002/ajmg.b.32839</identifier><identifier>PMID: 33645908</identifier><language>eng</language><publisher>Hoboken, USA: John Wiley & Sons, Inc</publisher><subject>bioinformatics ; DNA microarrays ; Gene expression ; Genetics ; Learning algorithms ; Machine learning ; major depression ; Mental depression ; Predictions ; Prefrontal cortex ; transcriptomics</subject><ispartof>American journal of medical genetics. 
Part B, Neuropsychiatric genetics, 2021-03, Vol.186 (2), p.101-112</ispartof><rights>2021 Wiley Periodicals LLC.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c3649-f071b3753b9c010345b197f403b383790a539deebc0813145c50d543af0186823</citedby><cites>FETCH-LOGICAL-c3649-f071b3753b9c010345b197f403b383790a539deebc0813145c50d543af0186823</cites><orcidid>0000-0002-8740-4416</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27923,27924</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33645908$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Qi, Bill</creatorcontrib><creatorcontrib>Ramamurthy, Janani</creatorcontrib><creatorcontrib>Bennani, Imane</creatorcontrib><creatorcontrib>Trakadis, Yannis J.</creatorcontrib><title>Machine learning and bioinformatic analysis of brain and blood mRNA profiles in major depressive disorder: A case–control study</title><title>American journal of medical genetics. Part B, Neuropsychiatric genetics</title><addtitle>Am J Med Genet B Neuropsychiatr Genet</addtitle><description>This study analyzed gene expression messenger RNA data, from cases with major depressive disorder (MDD) and controls, using supervised machine learning (ML). We built on the methodology of prior studies to obtain more generalizable/reproducible results. First, we obtained a classifier trained on gene expression data from the dorsolateral prefrontal cortex of post‐mortem MDD cases (n = 126) and controls (n = 103). An average area‐under‐the‐receiver‐operating‐characteristics‐curve (AUC) from 10‐fold cross‐validation of 0.72 was noted, compared to an average AUC of 0.55 for a baseline classifier (p = .0048). The classifier achieved an AUC of 0.76 on a previously unused testing‐set. 
We also performed external validation using DLPFC gene expression values from an independent cohort of matched MDD cases (n = 29) and controls (n = 29), obtained from Affymetrix microarray (vs. Illumina microarray for the original cohort) (AUC: 0.62). We highlighted gene sets differentially expressed in MDD that were enriched for genes identified by the ML algorithm. Next, we assessed the ML classification performance in blood‐based microarray gene expression data from MDD cases (n = 1,581) and controls (n = 369). We observed a mean AUC of 0.64 on 10‐fold cross‐validation, which was significantly above baseline (p = .0020). Similar performance was observed on the testing‐set (AUC: 0.61). Finally, we analyzed the classification performance in covariates subgroups. We identified an interesting interaction between smoking and recall performance in MDD case prediction (58% accurate predictions in cases who are smokers vs. 43% accurate predictions in cases who are non‐smokers). Overall, our results suggest that ML in combination with gene expression data and covariates could further our understanding of the pathophysiology in MDD.</description><subject>bioinformatics</subject><subject>DNA microarrays</subject><subject>Gene expression</subject><subject>Genetics</subject><subject>Learning algorithms</subject><subject>Machine learning</subject><subject>major depression</subject><subject>Mental depression</subject><subject>Predictions</subject><subject>Prefrontal 
cortex</subject><subject>transcriptomics</subject><issn>1552-4841</issn><issn>1552-485X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp90btKBDEUBuAgivfOWgI2Fu6aTBJnYrcuXvEComAXchvNMjNZkx1lO30G39AnMe6sFhZWCcnHz-H8AGxh1McIZftyVD_2VZ9kBeELYBUzlvVowR4Wf-8Ur4C1GEcIEcTyfBmsEHJAGUfFKni_kvrJNRZWVobGNY9QNgYq511T-lDLidPpRVbT6CL0JVRBuqYzlfcG1rfXAzgOvnSVjTB91XLkAzR2HGyM7sVC46IPxoZDOIBaRvv59qF9Mwm-gnHSmukGWCplFe3m_FwH9yfHd8Oz3uXN6flwcNnTaVjeK1GOFckZUVwjjAhlCvO8pIgoUpCcI8kIN9YqjQpMMGWaIcMokSXCxUGRkXWw2-WmaZ9bGyeidlHbqpKN9W0UGeW0yBOkie78oSPfhrSFmeIZZSk_qb1O6eBjDLYU4-BqGaYCI_FdjfiuRigxqybx7Xloq2prfvFPFwnQDrymXU7_DRODi6vToy73C3xkmzs</recordid><startdate>202103</startdate><enddate>202103</enddate><creator>Qi, Bill</creator><creator>Ramamurthy, Janani</creator><creator>Bennani, Imane</creator><creator>Trakadis, Yannis J.</creator><general>John Wiley & Sons, Inc</general><general>Wiley Subscription Services, Inc</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7TK</scope><scope>8FD</scope><scope>FR3</scope><scope>K9.</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-8740-4416</orcidid></search><sort><creationdate>202103</creationdate><title>Machine learning and bioinformatic analysis of brain and blood mRNA profiles in major depressive disorder: A case–control study</title><author>Qi, Bill ; Ramamurthy, Janani ; Bennani, Imane ; Trakadis, Yannis J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c3649-f071b3753b9c010345b197f403b383790a539deebc0813145c50d543af0186823</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>bioinformatics</topic><topic>DNA microarrays</topic><topic>Gene expression</topic><topic>Genetics</topic><topic>Learning algorithms</topic><topic>Machine learning</topic><topic>major 
depression</topic><topic>Mental depression</topic><topic>Predictions</topic><topic>Prefrontal cortex</topic><topic>transcriptomics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Qi, Bill</creatorcontrib><creatorcontrib>Ramamurthy, Janani</creatorcontrib><creatorcontrib>Bennani, Imane</creatorcontrib><creatorcontrib>Trakadis, Yannis J.</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Neurosciences Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>American journal of medical genetics. Part B, Neuropsychiatric genetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Qi, Bill</au><au>Ramamurthy, Janani</au><au>Bennani, Imane</au><au>Trakadis, Yannis J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Machine learning and bioinformatic analysis of brain and blood mRNA profiles in major depressive disorder: A case–control study</atitle><jtitle>American journal of medical genetics. Part B, Neuropsychiatric genetics</jtitle><addtitle>Am J Med Genet B Neuropsychiatr Genet</addtitle><date>2021-03</date><risdate>2021</risdate><volume>186</volume><issue>2</issue><spage>101</spage><epage>112</epage><pages>101-112</pages><issn>1552-4841</issn><eissn>1552-485X</eissn><abstract>This study analyzed gene expression messenger RNA data, from cases with major depressive disorder (MDD) and controls, using supervised machine learning (ML). We built on the methodology of prior studies to obtain more generalizable/reproducible results. 
First, we obtained a classifier trained on gene expression data from the dorsolateral prefrontal cortex of post‐mortem MDD cases (n = 126) and controls (n = 103). An average area‐under‐the‐receiver‐operating‐characteristics‐curve (AUC) from 10‐fold cross‐validation of 0.72 was noted, compared to an average AUC of 0.55 for a baseline classifier (p = .0048). The classifier achieved an AUC of 0.76 on a previously unused testing‐set. We also performed external validation using DLPFC gene expression values from an independent cohort of matched MDD cases (n = 29) and controls (n = 29), obtained from Affymetrix microarray (vs. Illumina microarray for the original cohort) (AUC: 0.62). We highlighted gene sets differentially expressed in MDD that were enriched for genes identified by the ML algorithm. Next, we assessed the ML classification performance in blood‐based microarray gene expression data from MDD cases (n = 1,581) and controls (n = 369). We observed a mean AUC of 0.64 on 10‐fold cross‐validation, which was significantly above baseline (p = .0020). Similar performance was observed on the testing‐set (AUC: 0.61). Finally, we analyzed the classification performance in covariates subgroups. We identified an interesting interaction between smoking and recall performance in MDD case prediction (58% accurate predictions in cases who are smokers vs. 43% accurate predictions in cases who are non‐smokers). Overall, our results suggest that ML in combination with gene expression data and covariates could further our understanding of the pathophysiology in MDD.</abstract><cop>Hoboken, USA</cop><pub>John Wiley & Sons, Inc</pub><pmid>33645908</pmid><doi>10.1002/ajmg.b.32839</doi><tpages>12</tpages><orcidid>https://orcid.org/0000-0002-8740-4416</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1552-4841 |
ispartof | American journal of medical genetics. Part B, Neuropsychiatric genetics, 2021-03, Vol.186 (2), p.101-112 |
issn | 1552-4841 ; 1552-485X |
language | eng |
recordid | cdi_proquest_miscellaneous_2494878234 |
source | Wiley-Blackwell Read & Publish Collection |
subjects | bioinformatics ; DNA microarrays ; Gene expression ; Genetics ; Learning algorithms ; Machine learning ; major depression ; Mental depression ; Predictions ; Prefrontal cortex ; transcriptomics |
title | Machine learning and bioinformatic analysis of brain and blood mRNA profiles in major depressive disorder: A case–control study |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T14%3A54%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Machine%20learning%20and%20bioinformatic%20analysis%20of%20brain%20and%20blood%20mRNA%20profiles%20in%20major%20depressive%20disorder:%20A%20case%E2%80%93control%20study&rft.jtitle=American%20journal%20of%20medical%20genetics.%20Part%20B,%20Neuropsychiatric%20genetics&rft.au=Qi,%20Bill&rft.date=2021-03&rft.volume=186&rft.issue=2&rft.spage=101&rft.epage=112&rft.pages=101-112&rft.issn=1552-4841&rft.eissn=1552-485X&rft_id=info:doi/10.1002/ajmg.b.32839&rft_dat=%3Cproquest_cross%3E2494878234%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c3649-f071b3753b9c010345b197f403b383790a539deebc0813145c50d543af0186823%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2499245145&rft_id=info:pmid/33645908&rfr_iscdi=true |
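
The description field above reports two kinds of metric: AUC for case vs. control classification, and recall among MDD cases split by a covariate (smokers vs. non-smokers). The following is a minimal sketch of how such metrics can be computed, not the authors' pipeline. The function names `auc` and `recall_by_subgroup`, the 0.5 decision threshold, and all of the toy data are assumptions made for illustration only.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case score and a control score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_by_subgroup(labels, predictions, groups):
    """Fraction of true cases predicted as cases, computed per subgroup."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for y, pred, g in zip(labels, predictions, groups):
        if y == 1:  # recall only looks at true cases
            totals[g] += 1
            hits[g] += int(pred == 1)
    return {g: hits[g] / totals[g] for g in totals}


# Toy data, invented for illustration (1 = case, 0 = control).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1, 0.35]
groups = ["smoker", "smoker", "non-smoker", "non-smoker",
          "smoker", "non-smoker", "smoker", "non-smoker"]
predictions = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5

print(auc(labels, scores))
print(recall_by_subgroup(labels, predictions, groups))
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch of this size; a rank-based implementation would be the usual choice for cohorts as large as the blood-based dataset described above.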