An Optimized Approach for Predicting Water Quality Features Based on Machine Learning
Traditionally, water quality is assessed using costly laboratory and statistical methods, rendering real-time monitoring useless. Poor water quality requires a more practical and cost-effective solution. The machine learning classification approach appears promising for rapid detection and predictio...
Published in: | Wireless communications and mobile computing 2022-09, Vol.2022, p.1-20 |
---|---|
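Main Authors: | Suwadi, Nur Afyfah; Derbali, Morched; Sani, Nor Samsiah; Lam, Meng Chun; Arshad, Haslina; Khan, Imran; Kim, Ki-Il |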
Format: | Article |
Language: | English |
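Subjects: | Accuracy; Algorithms; Artificial intelligence; Artificial neural networks; Biochemical oxygen demand; Classification; Classifiers; Datasets; Decision making; Environmental impact; Feature selection; Groundwater; International organizations; Internet of Things; Machine learning; Mathematical models; Methods; Model accuracy; Oxygen; Parameters; Pattern recognition systems; Quality assessment; Real time; Statistical methods; Support vector machines; Water quality |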
cited_by | cdi_FETCH-LOGICAL-c337t-a28010a95a7c048d975893fbe3a3fd16169f8f7d4f3a28fc63f8a61b3d56d3493 |
---|---|
cites | cdi_FETCH-LOGICAL-c337t-a28010a95a7c048d975893fbe3a3fd16169f8f7d4f3a28fc63f8a61b3d56d3493 |
container_end_page | 20 |
container_issue | |
container_start_page | 1 |
container_title | Wireless communications and mobile computing |
container_volume | 2022 |
creator | Suwadi, Nur Afyfah; Derbali, Morched; Sani, Nor Samsiah; Lam, Meng Chun; Arshad, Haslina; Khan, Imran; Kim, Ki-Il |
description | Traditionally, water quality is assessed using costly laboratory and statistical methods, rendering real-time monitoring useless. Poor water quality requires a more practical and cost-effective solution. The machine learning classification approach appears promising for rapid detection and prediction of water quality. Machine learning has been used successfully to predict water quality. However, research on machine learning for water quality index (WQI) prediction is generally lacking. Therefore, this research aims to identify the important features for the WQI, which requires classifying numerous indicators. This study develops four machine learning models (Artificial Neural Network, Support Vector Machine, Random Forest, and Naïve Bayes) based on the WQI and chemical parameters. Each model is trained and validated on the Langat Basin (Selangor) dataset from the Department of Environment of Malaysia. Several data preprocessing tasks, such as data cleaning and feature selection, were conducted on the raw dataset to ensure the quality of the training data. The performance of these machine learning algorithms is further refined using the feature sets chosen by several feature selection strategies, such as information gain, correlation, and symmetrical uncertainty. Each classifier is then optimized using different tuning parameters to achieve optimum values before the outputs of the classifiers are compared against each other. The results show that the optimized Random Forest classifier with the WQI parameters selected by the information gain feature selection method achieved the highest performance. The experimental results show that the WQI parameters are more relevant in predicting the WQI than the other variables. Consequently, this result shows that dissolved oxygen (DO) and biochemical oxygen demand (BOD) are important features for predicting WQI. The proposed model achieved reasonable accuracy with minimal parameters, indicating that it could be used in real-time water quality detection systems. |
doi_str_mv | 10.1155/2022/3397972 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2715338421</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2715338421</sourcerecordid><originalsourceid>FETCH-LOGICAL-c337t-a28010a95a7c048d975893fbe3a3fd16169f8f7d4f3a28fc63f8a61b3d56d3493</originalsourceid><addsrcrecordid>eNp9kEFLAzEQhYMoWKs3f0DAo65NMrub5FiLVaFSBYvHJd0kmtLurkkWqb_elBaPnmYO33sz7yF0ScktpUUxYoSxEYDkkrMjNKAFkEyUnB__7aU8RWchrAghQBgdoMW4wfMuuo37MRqPu863qv7EtvX4xRvt6uiaD_yuovH4tVdrF7d4alTsvQn4ToUkahv8nDSuMXhmlG-S4BydWLUO5uIwh2gxvX-bPGaz-cPTZDzLagAeM8UEoUTJQvGa5EJLXggJdmlAgdW0pKW0wnKdW0iorUuwQpV0CbooNeQShuhq75ve_upNiNWq7X2TTlaMp8ggckYTdbOnat-G4I2tOu82ym8rSqpdcdWuuOpQXMKv93iKpNW3-5_-BXn2bA8</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2715338421</pqid></control><display><type>article</type><title>An Optimized Approach for Predicting Water Quality Features Based on Machine Learning</title><source>Wiley Online Library Open Access</source><source>Publicly Available Content Database</source><creator>Suwadi, Nur Afyfah ; Derbali, Morched ; Sani, Nor Samsiah ; Lam, Meng Chun ; Arshad, Haslina ; Khan, Imran ; Kim, Ki-Il</creator><contributor>Shuja, Junaid ; Junaid Shuja</contributor><creatorcontrib>Suwadi, Nur Afyfah ; Derbali, Morched ; Sani, Nor Samsiah ; Lam, Meng Chun ; Arshad, Haslina ; Khan, Imran ; Kim, Ki-Il ; Shuja, Junaid ; Junaid Shuja</creatorcontrib><description>Traditionally, water quality is assessed using costly laboratory and statistical methods, rendering real-time monitoring useless. Poor water quality requires a more practical and cost-effective solution. The machine learning classification approach appears promising for rapid detection and prediction of water quality. Machine learning has been used successfully to predict water quality. 
However, research on machine learning for water quality index (WQI) prediction is generally lacking. Therefore, this research aims to identify the important features for the WQI, which requires classifying numerous indicators. This study develops four machine learning models (Artificial Neural Network, Support Vector Machine, Random Forest, and Naïve Bayes) based on the WQI and chemical parameters. Each model is trained and validated on the Langat Basin (Selangor) dataset from the Department of Environment of Malaysia. Several data preprocessing tasks, such as data cleaning and feature selection, were conducted on the raw dataset to ensure the quality of the training data. The performance of these machine learning algorithms is further refined using the feature sets chosen by several feature selection strategies, such as information gain, correlation, and symmetrical uncertainty. Each classifier is then optimized using different tuning parameters to achieve optimum values before the outputs of the classifiers are compared against each other. The results show that the optimized Random Forest classifier with the WQI parameters selected by the information gain feature selection method achieved the highest performance. The experimental results show that the WQI parameters are more relevant in predicting the WQI than the other variables. Consequently, this result shows that dissolved oxygen (DO) and biochemical oxygen demand (BOD) are important features for predicting WQI. 
The proposed model achieved reasonable accuracy with minimal parameters, indicating that it could be used in real-time water quality detection systems.</description><identifier>ISSN: 1530-8669</identifier><identifier>EISSN: 1530-8677</identifier><identifier>DOI: 10.1155/2022/3397972</identifier><language>eng</language><publisher>Oxford: Hindawi</publisher><subject>Accuracy ; Algorithms ; Artificial intelligence ; Artificial neural networks ; Biochemical oxygen demand ; Classification ; Classifiers ; Datasets ; Decision making ; Environmental impact ; Feature selection ; Groundwater ; International organizations ; Internet of Things ; Machine learning ; Mathematical models ; Methods ; Model accuracy ; Oxygen ; Parameters ; Pattern recognition systems ; Quality assessment ; Real time ; Statistical methods ; Support vector machines ; Water quality</subject><ispartof>Wireless communications and mobile computing, 2022-09, Vol.2022, p.1-20</ispartof><rights>Copyright © 2022 Nur Afyfah Suwadi et al.</rights><rights>Copyright © 2022 Nur Afyfah Suwadi et al. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c337t-a28010a95a7c048d975893fbe3a3fd16169f8f7d4f3a28fc63f8a61b3d56d3493</citedby><cites>FETCH-LOGICAL-c337t-a28010a95a7c048d975893fbe3a3fd16169f8f7d4f3a28fc63f8a61b3d56d3493</cites><orcidid>0000-0001-5802-5946 ; 0000-0003-3805-0532 ; 0000-0002-8366-3533</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/2715338421/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2715338421?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,25753,27924,27925,37012,44590,75126</link.rule.ids></links><search><contributor>Shuja, Junaid</contributor><contributor>Junaid Shuja</contributor><creatorcontrib>Suwadi, Nur Afyfah</creatorcontrib><creatorcontrib>Derbali, Morched</creatorcontrib><creatorcontrib>Sani, Nor Samsiah</creatorcontrib><creatorcontrib>Lam, Meng Chun</creatorcontrib><creatorcontrib>Arshad, Haslina</creatorcontrib><creatorcontrib>Khan, Imran</creatorcontrib><creatorcontrib>Kim, Ki-Il</creatorcontrib><title>An Optimized Approach for Predicting Water Quality Features Based on Machine Learning</title><title>Wireless communications and mobile computing</title><description>Traditionally, water quality is assessed using costly laboratory and statistical methods, rendering real-time monitoring useless. Poor water quality requires a more practical and cost-effective solution. The machine learning classification approach appears promising for rapid detection and prediction of water quality. 
Machine learning has been used successfully to predict water quality. However, research on machine learning for water quality index (WQI) prediction is generally lacking. Therefore, this research aims to identify the important features for the WQI, which requires classifying numerous indicators. This study develops four machine learning models (Artificial Neural Network, Support Vector Machine, Random Forest, and Naïve Bayes) based on the WQI and chemical parameters. Each model is trained and validated on the Langat Basin (Selangor) dataset from the Department of Environment of Malaysia. Several data preprocessing tasks, such as data cleaning and feature selection, were conducted on the raw dataset to ensure the quality of the training data. The performance of these machine learning algorithms is further refined using the feature sets chosen by several feature selection strategies, such as information gain, correlation, and symmetrical uncertainty. Each classifier is then optimized using different tuning parameters to achieve optimum values before the outputs of the classifiers are compared against each other. The results show that the optimized Random Forest classifier with the WQI parameters selected by the information gain feature selection method achieved the highest performance. The experimental results show that the WQI parameters are more relevant in predicting the WQI than the other variables. Consequently, this result shows that dissolved oxygen (DO) and biochemical oxygen demand (BOD) are important features for predicting WQI. 
The proposed model achieved reasonable accuracy with minimal parameters, indicating that it could be used in real-time water quality detection systems.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Artificial intelligence</subject><subject>Artificial neural networks</subject><subject>Biochemical oxygen demand</subject><subject>Classification</subject><subject>Classifiers</subject><subject>Datasets</subject><subject>Decision making</subject><subject>Environmental impact</subject><subject>Feature selection</subject><subject>Groundwater</subject><subject>International organizations</subject><subject>Internet of Things</subject><subject>Machine learning</subject><subject>Mathematical models</subject><subject>Methods</subject><subject>Model accuracy</subject><subject>Oxygen</subject><subject>Parameters</subject><subject>Pattern recognition systems</subject><subject>Quality assessment</subject><subject>Real time</subject><subject>Statistical methods</subject><subject>Support vector machines</subject><subject>Water quality</subject><issn>1530-8669</issn><issn>1530-8677</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNp9kEFLAzEQhYMoWKs3f0DAo65NMrub5FiLVaFSBYvHJd0kmtLurkkWqb_elBaPnmYO33sz7yF0ScktpUUxYoSxEYDkkrMjNKAFkEyUnB__7aU8RWchrAghQBgdoMW4wfMuuo37MRqPu863qv7EtvX4xRvt6uiaD_yuovH4tVdrF7d4alTsvQn4ToUkahv8nDSuMXhmlG-S4BydWLUO5uIwh2gxvX-bPGaz-cPTZDzLagAeM8UEoUTJQvGa5EJLXggJdmlAgdW0pKW0wnKdW0iorUuwQpV0CbooNeQShuhq75ve_upNiNWq7X2TTlaMp8ggckYTdbOnat-G4I2tOu82ym8rSqpdcdWuuOpQXMKv93iKpNW3-5_-BXn2bA8</recordid><startdate>20220909</startdate><enddate>20220909</enddate><creator>Suwadi, Nur Afyfah</creator><creator>Derbali, Morched</creator><creator>Sani, Nor Samsiah</creator><creator>Lam, Meng Chun</creator><creator>Arshad, Haslina</creator><creator>Khan, Imran</creator><creator>Kim, Ki-Il</creator><general>Hindawi</general><general>Hindawi 
Limited</general><scope>RHU</scope><scope>RHW</scope><scope>RHX</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>7XB</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><orcidid>https://orcid.org/0000-0001-5802-5946</orcidid><orcidid>https://orcid.org/0000-0003-3805-0532</orcidid><orcidid>https://orcid.org/0000-0002-8366-3533</orcidid></search><sort><creationdate>20220909</creationdate><title>An Optimized Approach for Predicting Water Quality Features Based on Machine Learning</title><author>Suwadi, Nur Afyfah ; Derbali, Morched ; Sani, Nor Samsiah ; Lam, Meng Chun ; Arshad, Haslina ; Khan, Imran ; Kim, Ki-Il</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c337t-a28010a95a7c048d975893fbe3a3fd16169f8f7d4f3a28fc63f8a61b3d56d3493</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Artificial intelligence</topic><topic>Artificial neural networks</topic><topic>Biochemical oxygen demand</topic><topic>Classification</topic><topic>Classifiers</topic><topic>Datasets</topic><topic>Decision making</topic><topic>Environmental impact</topic><topic>Feature selection</topic><topic>Groundwater</topic><topic>International organizations</topic><topic>Internet of Things</topic><topic>Machine learning</topic><topic>Mathematical models</topic><topic>Methods</topic><topic>Model 
accuracy</topic><topic>Oxygen</topic><topic>Parameters</topic><topic>Pattern recognition systems</topic><topic>Quality assessment</topic><topic>Real time</topic><topic>Statistical methods</topic><topic>Support vector machines</topic><topic>Water quality</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Suwadi, Nur Afyfah</creatorcontrib><creatorcontrib>Derbali, Morched</creatorcontrib><creatorcontrib>Sani, Nor Samsiah</creatorcontrib><creatorcontrib>Lam, Meng Chun</creatorcontrib><creatorcontrib>Arshad, Haslina</creatorcontrib><creatorcontrib>Khan, Imran</creatorcontrib><creatorcontrib>Kim, Ki-Il</creatorcontrib><collection>Hindawi Publishing Complete</collection><collection>Hindawi Publishing Subscription Journals</collection><collection>Hindawi Publishing Open Access</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer science database</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer 
and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>ProQuest advanced technologies & aerospace journals</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><jtitle>Wireless communications and mobile computing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Suwadi, Nur Afyfah</au><au>Derbali, Morched</au><au>Sani, Nor Samsiah</au><au>Lam, Meng Chun</au><au>Arshad, Haslina</au><au>Khan, Imran</au><au>Kim, Ki-Il</au><au>Shuja, Junaid</au><au>Junaid Shuja</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An Optimized Approach for Predicting Water Quality Features Based on Machine Learning</atitle><jtitle>Wireless communications and mobile computing</jtitle><date>2022-09-09</date><risdate>2022</risdate><volume>2022</volume><spage>1</spage><epage>20</epage><pages>1-20</pages><issn>1530-8669</issn><eissn>1530-8677</eissn><abstract>Traditionally, water quality is assessed using costly laboratory and statistical methods, rendering real-time monitoring useless. Poor water quality requires a more practical and cost-effective solution. The machine learning classification approach appears promising for rapid detection and prediction of water quality. Machine learning has been used successfully to predict water quality. However, research on machine learning for water quality index (WQI) prediction is generally lacking. 
Therefore, this research aims to identify the important features for the WQI, which requires classifying numerous indicators. This study develops four machine learning models (Artificial Neural Network, Support Vector Machine, Random Forest, and Naïve Bayes) based on the WQI and chemical parameters. Each model is trained and validated on the Langat Basin (Selangor) dataset from the Department of Environment of Malaysia. Several data preprocessing tasks, such as data cleaning and feature selection, were conducted on the raw dataset to ensure the quality of the training data. The performance of these machine learning algorithms is further refined using the feature sets chosen by several feature selection strategies, such as information gain, correlation, and symmetrical uncertainty. Each classifier is then optimized using different tuning parameters to achieve optimum values before the outputs of the classifiers are compared against each other. The results show that the optimized Random Forest classifier with the WQI parameters selected by the information gain feature selection method achieved the highest performance. The experimental results show that the WQI parameters are more relevant in predicting the WQI than the other variables. Consequently, this result shows that dissolved oxygen (DO) and biochemical oxygen demand (BOD) are important features for predicting WQI. The proposed model achieved reasonable accuracy with minimal parameters, indicating that it could be used in real-time water quality detection systems.</abstract><cop>Oxford</cop><pub>Hindawi</pub><doi>10.1155/2022/3397972</doi><tpages>20</tpages><orcidid>https://orcid.org/0000-0001-5802-5946</orcidid><orcidid>https://orcid.org/0000-0003-3805-0532</orcidid><orcidid>https://orcid.org/0000-0002-8366-3533</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1530-8669 |
ispartof | Wireless communications and mobile computing, 2022-09, Vol.2022, p.1-20 |
issn | 1530-8669 1530-8677 |
language | eng |
recordid | cdi_proquest_journals_2715338421 |
source | Wiley Online Library Open Access; Publicly Available Content Database |
subjects | Accuracy; Algorithms; Artificial intelligence; Artificial neural networks; Biochemical oxygen demand; Classification; Classifiers; Datasets; Decision making; Environmental impact; Feature selection; Groundwater; International organizations; Internet of Things; Machine learning; Mathematical models; Methods; Model accuracy; Oxygen; Parameters; Pattern recognition systems; Quality assessment; Real time; Statistical methods; Support vector machines; Water quality |
title | An Optimized Approach for Predicting Water Quality Features Based on Machine Learning |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T23%3A22%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20Optimized%20Approach%20for%20Predicting%20Water%20Quality%20Features%20Based%20on%20Machine%20Learning&rft.jtitle=Wireless%20communications%20and%20mobile%20computing&rft.au=Suwadi,%20Nur%20Afyfah&rft.date=2022-09-09&rft.volume=2022&rft.spage=1&rft.epage=20&rft.pages=1-20&rft.issn=1530-8669&rft.eissn=1530-8677&rft_id=info:doi/10.1155/2022/3397972&rft_dat=%3Cproquest_cross%3E2715338421%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c337t-a28010a95a7c048d975893fbe3a3fd16169f8f7d4f3a28fc63f8a61b3d56d3493%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2715338421&rft_id=info:pmid/&rfr_iscdi=true |
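
Illustrative note on the description field: the abstract names information gain as one of the feature selection strategies but the record gives no formula or code. The sketch below shows one common way such a ranking can be computed, as the entropy of the class labels minus the entropy that remains after splitting on a feature. It is not the authors' implementation. The function names, the discretized feature levels (`DO_level`, `BOD_level`, `site_code`), and the six toy records are all hypothetical and were made up for this example; real water quality parameters are continuous and would need binning or a different estimator first.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy left after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(table, labels):
    """Return (feature name, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(column, labels) for name, column in table.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, hand-made records: NOT data from the paper or the Langat Basin dataset.
toy_table = {
    "DO_level": ["high", "high", "low", "low", "high", "low"],
    "BOD_level": ["low", "low", "high", "high", "low", "low"],
    "site_code": ["a", "b", "a", "b", "a", "b"],
}
toy_labels = ["clean", "clean", "polluted", "polluted", "clean", "polluted"]

ranking = rank_features(toy_table, toy_labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy table the made-up `DO_level` column separates the two classes perfectly, so it ranks first, while `site_code` carries almost no information about the label. A ranking like this would typically be used to keep only the top few features before training a classifier such as Random Forest.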