Prediction of heart disease and classifiers’ sensitivity analysis
Heart disease (HD) is one of the most common diseases today, and early diagnosis is a crucial task for many health care providers, who seek to protect their patients from the disease and to save lives. In this paper, a comparative analysis of different classifiers was performed for t...
Published in: | BMC bioinformatics 2020-07, Vol.21 (1), p.1-278, Article 278 |
---|---|
Main Author: | Almustafa, Khaled Mohamad |
Format: | Article |
Language: | English |
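Subjects: | Algorithms; Bayesian analysis; Cardiovascular disease; Cardiovascular diseases; Classification; Classifiers; Comparative analysis; Coronary artery disease; Datasets; Decision tree J48; Decision trees; Feature extraction; Health care industry; Heart disease (HD); Heart diseases; High definition television; K-nearest neighbor; Machine learning; Methodology; Neural networks; Performance enhancement; Prediction; Sensitivity analysis; Stochasticity; Support vector machine (SVM); Support vector machines |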
cited_by | cdi_FETCH-LOGICAL-c574t-794e3ecf67e38f015acd170830790cbff84e264362b5e33ae08124bd53f135983 |
---|---|
cites | cdi_FETCH-LOGICAL-c574t-794e3ecf67e38f015acd170830790cbff84e264362b5e33ae08124bd53f135983 |
container_end_page | 278 |
container_issue | 1 |
container_start_page | 1 |
container_title | BMC bioinformatics |
container_volume | 21 |
creator | Almustafa, Khaled Mohamad |
description | Heart disease (HD) is one of the most common diseases today, and early diagnosis is a crucial task for many health care providers, who seek to protect their patients from the disease and to save lives. In this paper, a comparative analysis of different classifiers was performed on the Heart Disease dataset in order to correctly classify and/or predict HD cases with minimal attributes. The dataset contains 76 attributes, including the class attribute, for 1025 patients collected from Cleveland, Hungary, Switzerland, and Long Beach, but in this paper only a subset of 14 attributes is used, and each attribute has a given set value. The algorithms used were the K-Nearest Neighbor (K-NN), Naive Bayes, Decision tree J48, JRip, SVM, Adaboost, Stochastic Gradient Descent (SGD) and Decision Table (DT) classifiers, chosen to show how well the selected classification algorithms classify and/or predict the HD cases. Classifying the HD dataset with different algorithms gave very promising results in terms of classification accuracy: the K-NN (K = 1), Decision tree J48 and JRip classifiers reached accuracies of 99.7073, 98.0488 and 97.2683% respectively. A feature selection method was applied to the HD dataset using Classifier Subset Evaluator, and the results show enhanced classification accuracy for the K-NN (K = 1) and Decision Table classifiers, to 100 and 93.8537% respectively, when using only the selected features, that is, a combination of up to 4 attributes instead of 13 for the prediction of HD cases. Different classifiers were used and compared on the HD dataset, and we conclude that a reliable feature selection method benefits HD prediction by using a minimal number of attributes instead of having to consider all available ones. |
doi_str_mv | 10.1186/s12859-020-03626-y |
format | article |
fullrecord | <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_ad89b2b6950940888a9091042eef5d71</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A628642134</galeid><doaj_id>oai_doaj_org_article_ad89b2b6950940888a9091042eef5d71</doaj_id><sourcerecordid>A628642134</sourcerecordid><originalsourceid>FETCH-LOGICAL-c574t-794e3ecf67e38f015acd170830790cbff84e264362b5e33ae08124bd53f135983</originalsourceid><addsrcrecordid>eNptks1u1DAQxyMEoqXwApwicYFDyvgrcS5I1YqPlSqB-DhbjjPeepWNiyepyI3X4PV4EtzdCliEfLA18_PfM-N_UTxlcM6Yrl8S41q1FXCoQNS8rpZ7xSmTDas4A3X_r_NJ8YhoC8AaDephcSJ4zVSr4bRYfUjYBzeFOJbRl1do01T2gdASlnbsSzdYouADJvr5_UdJOFKYwk2Ylpy2w0KBHhcPvB0In9ztZ8WXN68_r95Vl-_frlcXl5VTjZyqppUo0Pm6QaE9MGVdzxrQApoWXOe9lshrmTvpFAphETTjsuuV8EzkasVZsT7o9tFuzXUKO5sWE20w-0BMG5OrD25AY3vddryrWwWtBK21baFlIDmiV33Dstarg9b13O2wdzhOyQ5HoseZMVyZTbwxjRCMC5EFnt8JpPh1RprMLpDDYbAjxpkMlzy3KJhUGX32D7qNc8rD21P5j7gC_Yfa2NxAGH3M77pbUXNRc11LzoTM1Pl_qLx63AUXR_Qhx48uvDi6kJkJv00bOxOZ9aePxyw_sC5FooT-9zwYmFvLmYPlTLac2VvOLOIXf4jFeQ</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2424712508</pqid></control><display><type>article</type><title>Prediction of heart disease and classifiers’ sensitivity analysis</title><source>Publicly Available Content Database</source><source>PubMed Central</source><creator>Almustafa, Khaled Mohamad</creator><creatorcontrib>Almustafa, Khaled Mohamad</creatorcontrib><description>Heart disease (HD) is one of the most common diseases nowadays, and an early diagnosis of such a disease is a crucial task for many health care providers to prevent their patients for such a disease and to save lives. In this paper, a comparative analysis of different classifiers was performed for the classification of the Heart Disease dataset in order to correctly classify and or predict HD cases with minimal attributes. 
The set contains 76 attributes including the class attribute, for 1025 patients collected from Cleveland, Hungary, Switzerland, and Long Beach, but in this paper, only a subset of 14 attributes are used, and each attribute has a given set value. The algorithms used K- Nearest Neighbor (K-NN), Naive Bayes, Decision tree J48, JRip, SVM, Adaboost, Stochastic Gradient Decent (SGD) and Decision Table (DT) classifiers to show the performance of the selected classifications algorithms to best classify, and or predict, the HD cases. It was shown that using different classification algorithms for the classification of the HD dataset gives very promising results in term of the classification accuracy for the K-NN (K = 1), Decision tree J48 and JRip classifiers with accuracy of classification of 99.7073, 98.0488 and 97.2683% respectively. A feature extraction method was performed using Classifier Subset Evaluator on the HD dataset, and results show enhanced performance in term of the classification accuracy for K-NN (N = 1) and Decision Table classifiers to 100 and 93.8537% respectively after using the selected features by only applying a combination of up to 4 attributes instead of 13 attributes for the predication of the HD cases. 
Different classifiers were used and compared to classify the HD dataset, and we concluded the benefit of having a reliable feature selection method for HD disease prediction with using minimal number of attributes instead of having to consider all available ones.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/s12859-020-03626-y</identifier><identifier>PMID: 32615980</identifier><language>eng</language><publisher>London: BioMed Central Ltd</publisher><subject>Algorithms ; Bayesian analysis ; Cardiovascular disease ; Cardiovascular diseases ; Classification ; Classifiers ; Comparative analysis ; Coronary artery disease ; Datasets ; Decision tree J48 ; Decision trees ; Feature extraction ; Health care industry ; Heart disease (HD) ; Heart diseases ; High definition television ; K-nearest neighbor ; Machine learning ; Methodology ; Neural networks ; Performance enhancement ; Prediction ; Sensitivity analysis ; Stochasticity ; Support vector machine (SVM) ; Support vector machines</subject><ispartof>BMC bioinformatics, 2020-07, Vol.21 (1), p.1-278, Article 278</ispartof><rights>COPYRIGHT 2020 BioMed Central Ltd.</rights><rights>2020. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>The Author(s) 2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c574t-794e3ecf67e38f015acd170830790cbff84e264362b5e33ae08124bd53f135983</citedby><cites>FETCH-LOGICAL-c574t-794e3ecf67e38f015acd170830790cbff84e264362b5e33ae08124bd53f135983</cites><orcidid>0000-0003-2129-7686</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7331233/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2424712508?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,25753,27924,27925,37012,37013,44590,53791,53793</link.rule.ids></links><search><creatorcontrib>Almustafa, Khaled Mohamad</creatorcontrib><title>Prediction of heart disease and classifiers’ sensitivity analysis</title><title>BMC bioinformatics</title><description>Heart disease (HD) is one of the most common diseases nowadays, and an early diagnosis of such a disease is a crucial task for many health care providers to prevent their patients for such a disease and to save lives. In this paper, a comparative analysis of different classifiers was performed for the classification of the Heart Disease dataset in order to correctly classify and or predict HD cases with minimal attributes. The set contains 76 attributes including the class attribute, for 1025 patients collected from Cleveland, Hungary, Switzerland, and Long Beach, but in this paper, only a subset of 14 attributes are used, and each attribute has a given set value. 
The algorithms used K- Nearest Neighbor (K-NN), Naive Bayes, Decision tree J48, JRip, SVM, Adaboost, Stochastic Gradient Decent (SGD) and Decision Table (DT) classifiers to show the performance of the selected classifications algorithms to best classify, and or predict, the HD cases. It was shown that using different classification algorithms for the classification of the HD dataset gives very promising results in term of the classification accuracy for the K-NN (K = 1), Decision tree J48 and JRip classifiers with accuracy of classification of 99.7073, 98.0488 and 97.2683% respectively. A feature extraction method was performed using Classifier Subset Evaluator on the HD dataset, and results show enhanced performance in term of the classification accuracy for K-NN (N = 1) and Decision Table classifiers to 100 and 93.8537% respectively after using the selected features by only applying a combination of up to 4 attributes instead of 13 attributes for the predication of the HD cases. Different classifiers were used and compared to classify the HD dataset, and we concluded the benefit of having a reliable feature selection method for HD disease prediction with using minimal number of attributes instead of having to consider all available ones.</description><subject>Algorithms</subject><subject>Bayesian analysis</subject><subject>Cardiovascular disease</subject><subject>Cardiovascular diseases</subject><subject>Classification</subject><subject>Classifiers</subject><subject>Comparative analysis</subject><subject>Coronary artery disease</subject><subject>Datasets</subject><subject>Decision tree J48</subject><subject>Decision trees</subject><subject>Feature extraction</subject><subject>Health care industry</subject><subject>Heart disease (HD)</subject><subject>Heart diseases</subject><subject>High definition television</subject><subject>K-nearest neighbor</subject><subject>Machine learning</subject><subject>Methodology</subject><subject>Neural 
networks</subject><subject>Performance enhancement</subject><subject>Prediction</subject><subject>Sensitivity analysis</subject><subject>Stochasticity</subject><subject>Support vector machine (SVM)</subject><subject>Support vector machines</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNptks1u1DAQxyMEoqXwApwicYFDyvgrcS5I1YqPlSqB-DhbjjPeepWNiyepyI3X4PV4EtzdCliEfLA18_PfM-N_UTxlcM6Yrl8S41q1FXCoQNS8rpZ7xSmTDas4A3X_r_NJ8YhoC8AaDephcSJ4zVSr4bRYfUjYBzeFOJbRl1do01T2gdASlnbsSzdYouADJvr5_UdJOFKYwk2Ylpy2w0KBHhcPvB0In9ztZ8WXN68_r95Vl-_frlcXl5VTjZyqppUo0Pm6QaE9MGVdzxrQApoWXOe9lshrmTvpFAphETTjsuuV8EzkasVZsT7o9tFuzXUKO5sWE20w-0BMG5OrD25AY3vddryrWwWtBK21baFlIDmiV33Dstarg9b13O2wdzhOyQ5HoseZMVyZTbwxjRCMC5EFnt8JpPh1RprMLpDDYbAjxpkMlzy3KJhUGX32D7qNc8rD21P5j7gC_Yfa2NxAGH3M77pbUXNRc11LzoTM1Pl_qLx63AUXR_Qhx48uvDi6kJkJv00bOxOZ9aePxyw_sC5FooT-9zwYmFvLmYPlTLac2VvOLOIXf4jFeQ</recordid><startdate>20200702</startdate><enddate>20200702</enddate><creator>Almustafa, Khaled Mohamad</creator><general>BioMed Central Ltd</general><general>BioMed 
Central</general><general>BMC</general><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0003-2129-7686</orcidid></search><sort><creationdate>20200702</creationdate><title>Prediction of heart disease and classifiers’ sensitivity analysis</title><author>Almustafa, Khaled Mohamad</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c574t-794e3ecf67e38f015acd170830790cbff84e264362b5e33ae08124bd53f135983</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Bayesian analysis</topic><topic>Cardiovascular disease</topic><topic>Cardiovascular diseases</topic><topic>Classification</topic><topic>Classifiers</topic><topic>Comparative analysis</topic><topic>Coronary artery disease</topic><topic>Datasets</topic><topic>Decision tree J48</topic><topic>Decision trees</topic><topic>Feature extraction</topic><topic>Health care industry</topic><topic>Heart 
disease (HD)</topic><topic>Heart diseases</topic><topic>High definition television</topic><topic>K-nearest neighbor</topic><topic>Machine learning</topic><topic>Methodology</topic><topic>Neural networks</topic><topic>Performance enhancement</topic><topic>Prediction</topic><topic>Sensitivity analysis</topic><topic>Stochasticity</topic><topic>Support vector machine (SVM)</topic><topic>Support vector machines</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Almustafa, Khaled Mohamad</creatorcontrib><collection>CrossRef</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest 
Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>ProQuest Biological Science Collection</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Biological Science Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Almustafa, Khaled Mohamad</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Prediction of heart disease and classifiers’ sensitivity analysis</atitle><jtitle>BMC 
bioinformatics</jtitle><date>2020-07-02</date><risdate>2020</risdate><volume>21</volume><issue>1</issue><spage>1</spage><epage>278</epage><pages>1-278</pages><artnum>278</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Heart disease (HD) is one of the most common diseases nowadays, and an early diagnosis of such a disease is a crucial task for many health care providers to prevent their patients for such a disease and to save lives. In this paper, a comparative analysis of different classifiers was performed for the classification of the Heart Disease dataset in order to correctly classify and or predict HD cases with minimal attributes. The set contains 76 attributes including the class attribute, for 1025 patients collected from Cleveland, Hungary, Switzerland, and Long Beach, but in this paper, only a subset of 14 attributes are used, and each attribute has a given set value. The algorithms used K- Nearest Neighbor (K-NN), Naive Bayes, Decision tree J48, JRip, SVM, Adaboost, Stochastic Gradient Decent (SGD) and Decision Table (DT) classifiers to show the performance of the selected classifications algorithms to best classify, and or predict, the HD cases. It was shown that using different classification algorithms for the classification of the HD dataset gives very promising results in term of the classification accuracy for the K-NN (K = 1), Decision tree J48 and JRip classifiers with accuracy of classification of 99.7073, 98.0488 and 97.2683% respectively. A feature extraction method was performed using Classifier Subset Evaluator on the HD dataset, and results show enhanced performance in term of the classification accuracy for K-NN (N = 1) and Decision Table classifiers to 100 and 93.8537% respectively after using the selected features by only applying a combination of up to 4 attributes instead of 13 attributes for the predication of the HD cases. 
Different classifiers were used and compared to classify the HD dataset, and we concluded the benefit of having a reliable feature selection method for HD disease prediction with using minimal number of attributes instead of having to consider all available ones.</abstract><cop>London</cop><pub>BioMed Central Ltd</pub><pmid>32615980</pmid><doi>10.1186/s12859-020-03626-y</doi><orcidid>https://orcid.org/0000-0003-2129-7686</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1471-2105 |
ispartof | BMC bioinformatics, 2020-07, Vol.21 (1), p.1-278, Article 278 |
issn | ISSN: 1471-2105; EISSN: 1471-2105 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_ad89b2b6950940888a9091042eef5d71 |
source | Publicly Available Content Database; PubMed Central |
subjects | Algorithms; Bayesian analysis; Cardiovascular disease; Cardiovascular diseases; Classification; Classifiers; Comparative analysis; Coronary artery disease; Datasets; Decision tree J48; Decision trees; Feature extraction; Health care industry; Heart disease (HD); Heart diseases; High definition television; K-nearest neighbor; Machine learning; Methodology; Neural networks; Performance enhancement; Prediction; Sensitivity analysis; Stochasticity; Support vector machine (SVM); Support vector machines |
title | Prediction of heart disease and classifiers’ sensitivity analysis |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T15%3A32%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Prediction%20of%20heart%20disease%20and%20classifiers%E2%80%99%20sensitivity%20analysis&rft.jtitle=BMC%20bioinformatics&rft.au=Almustafa,%20Khaled%20Mohamad&rft.date=2020-07-02&rft.volume=21&rft.issue=1&rft.spage=1&rft.epage=278&rft.pages=1-278&rft.artnum=278&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/s12859-020-03626-y&rft_dat=%3Cgale_doaj_%3EA628642134%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c574t-794e3ecf67e38f015acd170830790cbff84e264362b5e33ae08124bd53f135983%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2424712508&rft_id=info:pmid/32615980&rft_galeid=A628642134&rfr_iscdi=true |