k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification
In this paper, a computational method based on machine learning for identifying Alzheimer's disease genes is proposed. Most existing machine-learning-based methods predict Alzheimer's disease genes using structural magnetic resonance imaging (MR...
Published in: | Frontiers in genetics 2019-02, Vol.10, p.33 |
---|---|
Main Authors: | Xu, Lei ; Liang, Guangmin ; Liao, Changrui ; Chen, Gin-Den ; Chang, Chi-Chang |
Format: | Article |
Language: | English |
Subjects: | Alzheimer's disease ; gene coding ; Genetics ; n-gram model ; random forest ; sequence information |
cited_by | cdi_FETCH-LOGICAL-c462t-412332cf3de295fbfd0db42bc3e8c57ec8e922bd131ec8f21f447c9508cbc1773 |
---|---|
cites | cdi_FETCH-LOGICAL-c462t-412332cf3de295fbfd0db42bc3e8c57ec8e922bd131ec8f21f447c9508cbc1773 |
container_end_page | |
container_issue | |
container_start_page | 33 |
container_title | Frontiers in genetics |
container_volume | 10 |
creator | Xu, Lei ; Liang, Guangmin ; Liao, Changrui ; Chen, Gin-Den ; Chang, Chi-Chang |
description | In this paper, a computational method based on machine learning for identifying Alzheimer's disease genes is proposed. Most existing machine-learning-based methods predict Alzheimer's disease genes using structural magnetic resonance imaging (MRI). These methods attain acceptable results, but they are expensive and time-consuming. We therefore propose a computational method that identifies Alzheimer's disease genes from protein sequence information and classifies the resulting feature vectors with a random forest. In the proposed method, the protein sequence information is extracted as adaptive k-skip-n-gram features. Experimental results show that the proposed method attains an accuracy of 85.5% on the selected UniProt dataset. |
doi_str_mv | 10.3389/fgene.2019.00033 |
format | article |
fullrecord | <record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_ddcfeadb7b014ec49d1b132d4527aa0e</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_ddcfeadb7b014ec49d1b132d4527aa0e</doaj_id><sourcerecordid>2186625517</sourcerecordid><originalsourceid>FETCH-LOGICAL-c462t-412332cf3de295fbfd0db42bc3e8c57ec8e922bd131ec8f21f447c9508cbc1773</originalsourceid><addsrcrecordid>eNpVkUtPGzEQgK2KqiDKvafKN7hs8Gsf7gEpUEIjUbWi7bGy_Bgnht11am-Q4NezSSiCk0eemW_G_hD6RMmE80ae-gX0MGGEygkhhPN36IBWlSgawujeq3gfHeV8O5YQITnn4gPa56Qhkgl2gP7eFb_uwqroi6uku-Jm9gVP8Y3uXezwLCbIAz7XGRz-DsMyOuxjwtP2cQmhg3Sc8deQYczjnykOEHo8d9APwQerhxD7j-i9122Go-fzEP2ZXf6--FZc_7iaX0yvCysqNhSCMs6Z9dwBk6U33hFnBDOWQ2PLGmwDkjHjKKdj7Bn1QtRWlqSxxtK65odovuO6qG_VKoVOpwcVdVDbi5gWSqch2BaUc9aDdqY2hAqwQjpqKGdOlKzWmsDIOtuxVmvTgbPje5Ju30DfZvqwVIt4rypeS1HSEXDyDEjx33r8QdWFbKFtdQ9xnRWjTVWxsqSbvcmu1KaYcwL_MoYStZGstpLVRrLaSh5bPr9e76Xhv1L-BBOhpCM</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2186625517</pqid></control><display><type>article</type><title>k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification</title><source>Open Access: PubMed Central</source><creator>Xu, Lei ; Liang, Guangmin ; Liao, Changrui ; Chen, Gin-Den ; Chang, Chi-Chang</creator><creatorcontrib>Xu, Lei ; Liang, Guangmin ; Liao, Changrui ; Chen, Gin-Den ; Chang, Chi-Chang</creatorcontrib><description>In this paper, a computational method based on machine learning technique for identifying Alzheimer's disease genes is proposed. Compared with most existing machine learning based methods, existing methods predict Alzheimer's disease genes by using structural magnetic resonance imaging (MRI) technique. Most methods have attained acceptable results, but the cost is expensive and time consuming. 
Thus, we proposed a computational method for identifying Alzheimer disease genes by use of the sequence information of proteins, and classify the feature vectors by random forest. In the proposed method, the gene protein information is extracted by adaptive k-skip-n-gram features. The proposed method can attain the accuracy to 85.5% on the selected UniProt dataset, which has been demonstrated by the experimental results.</description><identifier>ISSN: 1664-8021</identifier><identifier>EISSN: 1664-8021</identifier><identifier>DOI: 10.3389/fgene.2019.00033</identifier><identifier>PMID: 30809242</identifier><language>eng</language><publisher>Switzerland: Frontiers Media S.A</publisher><subject>Alzheimer's disease ; gene coding ; Genetics ; n-gram model ; random forest ; sequence information</subject><ispartof>Frontiers in genetics, 2019-02, Vol.10, p.33</ispartof><rights>Copyright © 2019 Xu, Liang, Liao, Chen and Chang. 2019 Xu, Liang, Liao, Chen and Chang</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c462t-412332cf3de295fbfd0db42bc3e8c57ec8e922bd131ec8f21f447c9508cbc1773</citedby><cites>FETCH-LOGICAL-c462t-412332cf3de295fbfd0db42bc3e8c57ec8e922bd131ec8f21f447c9508cbc1773</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6379451/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC6379451/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/30809242$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Xu, Lei</creatorcontrib><creatorcontrib>Liang, 
Guangmin</creatorcontrib><creatorcontrib>Liao, Changrui</creatorcontrib><creatorcontrib>Chen, Gin-Den</creatorcontrib><creatorcontrib>Chang, Chi-Chang</creatorcontrib><title>k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification</title><title>Frontiers in genetics</title><addtitle>Front Genet</addtitle><description>In this paper, a computational method based on machine learning technique for identifying Alzheimer's disease genes is proposed. Compared with most existing machine learning based methods, existing methods predict Alzheimer's disease genes by using structural magnetic resonance imaging (MRI) technique. Most methods have attained acceptable results, but the cost is expensive and time consuming. Thus, we proposed a computational method for identifying Alzheimer disease genes by use of the sequence information of proteins, and classify the feature vectors by random forest. In the proposed method, the gene protein information is extracted by adaptive k-skip-n-gram features. 
The proposed method can attain the accuracy to 85.5% on the selected UniProt dataset, which has been demonstrated by the experimental results.</description><subject>Alzheimer's disease</subject><subject>gene coding</subject><subject>Genetics</subject><subject>n-gram model</subject><subject>random forest</subject><subject>sequence information</subject><issn>1664-8021</issn><issn>1664-8021</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>DOA</sourceid><recordid>eNpVkUtPGzEQgK2KqiDKvafKN7hs8Gsf7gEpUEIjUbWi7bGy_Bgnht11am-Q4NezSSiCk0eemW_G_hD6RMmE80ae-gX0MGGEygkhhPN36IBWlSgawujeq3gfHeV8O5YQITnn4gPa56Qhkgl2gP7eFb_uwqroi6uku-Jm9gVP8Y3uXezwLCbIAz7XGRz-DsMyOuxjwtP2cQmhg3Sc8deQYczjnykOEHo8d9APwQerhxD7j-i9122Go-fzEP2ZXf6--FZc_7iaX0yvCysqNhSCMs6Z9dwBk6U33hFnBDOWQ2PLGmwDkjHjKKdj7Bn1QtRWlqSxxtK65odovuO6qG_VKoVOpwcVdVDbi5gWSqch2BaUc9aDdqY2hAqwQjpqKGdOlKzWmsDIOtuxVmvTgbPje5Ju30DfZvqwVIt4rypeS1HSEXDyDEjx33r8QdWFbKFtdQ9xnRWjTVWxsqSbvcmu1KaYcwL_MoYStZGstpLVRrLaSh5bPr9e76Xhv1L-BBOhpCM</recordid><startdate>20190212</startdate><enddate>20190212</enddate><creator>Xu, Lei</creator><creator>Liang, Guangmin</creator><creator>Liao, Changrui</creator><creator>Chen, Gin-Den</creator><creator>Chang, Chi-Chang</creator><general>Frontiers Media S.A</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20190212</creationdate><title>k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification</title><author>Xu, Lei ; Liang, Guangmin ; Liao, Changrui ; Chen, Gin-Den ; Chang, Chi-Chang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c462t-412332cf3de295fbfd0db42bc3e8c57ec8e922bd131ec8f21f447c9508cbc1773</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Alzheimer's 
disease</topic><topic>gene coding</topic><topic>Genetics</topic><topic>n-gram model</topic><topic>random forest</topic><topic>sequence information</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Xu, Lei</creatorcontrib><creatorcontrib>Liang, Guangmin</creatorcontrib><creatorcontrib>Liao, Changrui</creatorcontrib><creatorcontrib>Chen, Gin-Den</creatorcontrib><creatorcontrib>Chang, Chi-Chang</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>Frontiers in genetics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Xu, Lei</au><au>Liang, Guangmin</au><au>Liao, Changrui</au><au>Chen, Gin-Den</au><au>Chang, Chi-Chang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification</atitle><jtitle>Frontiers in genetics</jtitle><addtitle>Front Genet</addtitle><date>2019-02-12</date><risdate>2019</risdate><volume>10</volume><spage>33</spage><pages>33-</pages><issn>1664-8021</issn><eissn>1664-8021</eissn><abstract>In this paper, a computational method based on machine learning technique for identifying Alzheimer's disease genes is proposed. Compared with most existing machine learning based methods, existing methods predict Alzheimer's disease genes by using structural magnetic resonance imaging (MRI) technique. Most methods have attained acceptable results, but the cost is expensive and time consuming. Thus, we proposed a computational method for identifying Alzheimer disease genes by use of the sequence information of proteins, and classify the feature vectors by random forest. 
In the proposed method, the gene protein information is extracted by adaptive k-skip-n-gram features. The proposed method can attain the accuracy to 85.5% on the selected UniProt dataset, which has been demonstrated by the experimental results.</abstract><cop>Switzerland</cop><pub>Frontiers Media S.A</pub><pmid>30809242</pmid><doi>10.3389/fgene.2019.00033</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1664-8021 |
ispartof | Frontiers in genetics, 2019-02, Vol.10, p.33 |
issn | 1664-8021 1664-8021 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_ddcfeadb7b014ec49d1b132d4527aa0e |
source | Open Access: PubMed Central |
subjects | Alzheimer's disease ; gene coding ; Genetics ; n-gram model ; random forest ; sequence information |
title | k-Skip-n-Gram-RF: A Random Forest Based Method for Alzheimer's Disease Protein Identification |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T03%3A02%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=k-Skip-n-Gram-RF:%20A%20Random%20Forest%20Based%20Method%20for%20Alzheimer's%20Disease%20Protein%20Identification&rft.jtitle=Frontiers%20in%20genetics&rft.au=Xu,%20Lei&rft.date=2019-02-12&rft.volume=10&rft.spage=33&rft.pages=33-&rft.issn=1664-8021&rft.eissn=1664-8021&rft_id=info:doi/10.3389/fgene.2019.00033&rft_dat=%3Cproquest_doaj_%3E2186625517%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c462t-412332cf3de295fbfd0db42bc3e8c57ec8e922bd131ec8f21f447c9508cbc1773%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2186625517&rft_id=info:pmid/30809242&rfr_iscdi=true |
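
The description in this record names k-skip-n-gram features extracted from protein sequences as the input to a random forest. A minimal sketch of such a feature extractor is given here for orientation only. It assumes that consecutive residues in each n-gram are separated by the same gap of 0 to k skipped positions, and that counts are normalised to frequencies over the 20 standard amino acids. The record does not specify these details, and the article's "adaptive" variant may differ. The names `k_skip_n_grams` and `feature_vector` are hypothetical.

```python
from collections import Counter
from itertools import product

# Assumption: only the 20 standard amino acids form the feature vocabulary.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_skip_n_grams(seq, n=2, k=1):
    """Yield n-grams whose consecutive residues are separated by a
    uniform gap of 0..k skipped positions (an assumed definition)."""
    for gap in range(k + 1):
        step = gap + 1
        span = (n - 1) * step  # distance from first to last residue
        for i in range(len(seq) - span):
            yield seq[i : i + span + 1 : step]


def feature_vector(seq, n=2, k=1):
    """Return the frequency of every possible n-gram over AMINO_ACIDS,
    in lexicographic order (length 20**n)."""
    counts = Counter(k_skip_n_grams(seq, n, k))
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    vocab = ("".join(p) for p in product(AMINO_ACIDS, repeat=n))
    return [counts[gram] / total for gram in vocab]
```

Each sequence maps to a fixed-length vector (400 entries for n = 2), which is the form a random forest implementation such as scikit-learn's `RandomForestClassifier` accepts. How the article tunes n, k, or the forest itself is not stated in this record.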