
An Optimized Approach for Predicting Water Quality Features Based on Machine Learning

Traditionally, water quality is assessed using costly laboratory and statistical methods, which makes real-time monitoring impractical. Poor water quality requires a more practical and cost-effective solution. The machine learning classification approach appears promising for rapid detection and predictio...

Bibliographic Details
Published in: Wireless communications and mobile computing 2022-09, Vol.2022, p.1-20
Main Authors: Suwadi, Nur Afyfah, Derbali, Morched, Sani, Nor Samsiah, Lam, Meng Chun, Arshad, Haslina, Khan, Imran, Kim, Ki-Il
Format: Article
Language: English
container_end_page 20
container_issue
container_start_page 1
container_title Wireless communications and mobile computing
container_volume 2022
creator Suwadi, Nur Afyfah
Derbali, Morched
Sani, Nor Samsiah
Lam, Meng Chun
Arshad, Haslina
Khan, Imran
Kim, Ki-Il
description Traditionally, water quality is assessed using costly laboratory and statistical methods, which makes real-time monitoring impractical. Poor water quality requires a more practical and cost-effective solution. The machine learning classification approach appears promising for rapid detection and prediction of water quality. Machine learning has been used successfully to predict water quality; however, research on machine learning for water quality index (WQI) prediction is generally lacking. Therefore, this research aims to identify the important features for the WQI, which requires classifying numerous indicators. This study develops four machine learning models (Artificial Neural Network, Support Vector Machine, Random Forest, and Naïve Bayes) based on the WQI and chemical parameters. Each model is trained and validated on the Langat Basin (Selangor) dataset from the Department of Environment of Malaysia. Several data preprocessing tasks, such as data cleaning and feature selection, were conducted on the raw dataset to ensure the quality of the training data. The performance of these machine learning algorithms is further improved using the feature sets chosen by several feature selection strategies, such as information gain, correlation, and symmetrical uncertainty. Each classifier is then optimized using different tuning parameters before the outputs of the classifiers are compared against each other. The experimental results show that the optimized Random Forest classifier, with the WQI parameters selected by the information gain feature selection method, achieved the highest performance, and that the WQI parameters are more relevant in predicting the WQI than the other variables. In particular, dissolved oxygen (DO) and biochemical oxygen demand (BOD) are important features for predicting WQI. The proposed model achieved reasonable accuracy with minimal parameters, indicating that it could be used in real-time water quality detection systems.
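The abstract names information gain and symmetrical uncertainty as feature selection strategies but gives no formulas or code. The following is a minimal sketch of the textbook definitions for discrete features, not the authors' implementation. The function names, the toy columns `do_level` and `temp_band`, and the class labels are hypothetical and do not come from the Langat Basin dataset; real sensor readings would also need to be discretized first.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """H(labels) - H(labels | feature) for one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def symmetrical_uncertainty(feature, labels):
    """Information gain normalized to [0, 1]: 2 * IG / (H(feature) + H(labels))."""
    denom = entropy(feature) + entropy(labels)
    return 0.0 if denom == 0 else 2 * information_gain(feature, labels) / denom


def rank_features(table, labels, score=information_gain):
    """Return feature names ordered from most to least informative."""
    return sorted(table, key=lambda name: score(table[name], labels), reverse=True)


# Hypothetical toy data: four samples, two discretized features.
LABELS = ["clean", "clean", "polluted", "polluted"]
TOY = {
    "temp_band": ["warm", "cool", "warm", "cool"],  # unrelated to the label
    "do_level": ["high", "high", "low", "low"],     # separates the classes
}

if __name__ == "__main__":
    for name in rank_features(TOY, LABELS):
        print(name, information_gain(TOY[name], LABELS))
```

In this toy table the `do_level` column determines the label exactly, so it ranks first; the top-ranked features would then be passed to a classifier such as Random Forest.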
doi_str_mv 10.1155/2022/3397972
format article
contributor Shuja, Junaid
publisher Oxford: Hindawi
creationdate 2022-09-09
rights Copyright © 2022 Nur Afyfah Suwadi et al. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
lds50 peer_reviewed
oa free_for_read
orcidid 0000-0001-5802-5946 ; 0000-0003-3805-0532 ; 0000-0002-8366-3533
fulltext fulltext
identifier ISSN: 1530-8669
ispartof Wireless communications and mobile computing, 2022-09, Vol.2022, p.1-20
issn 1530-8669
1530-8677
language eng
recordid cdi_proquest_journals_2715338421
source Wiley Online Library Open Access; Publicly Available Content Database
subjects Accuracy
Algorithms
Artificial intelligence
Artificial neural networks
Biochemical oxygen demand
Classification
Classifiers
Datasets
Decision making
Environmental impact
Feature selection
Groundwater
International organizations
Internet of Things
Machine learning
Mathematical models
Methods
Model accuracy
Oxygen
Parameters
Pattern recognition systems
Quality assessment
Real time
Statistical methods
Support vector machines
Water quality
title An Optimized Approach for Predicting Water Quality Features Based on Machine Learning
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T23%3A22%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20Optimized%20Approach%20for%20Predicting%20Water%20Quality%20Features%20Based%20on%20Machine%20Learning&rft.jtitle=Wireless%20communications%20and%20mobile%20computing&rft.au=Suwadi,%20Nur%20Afyfah&rft.date=2022-09-09&rft.volume=2022&rft.spage=1&rft.epage=20&rft.pages=1-20&rft.issn=1530-8669&rft.eissn=1530-8677&rft_id=info:doi/10.1155/2022/3397972&rft_dat=%3Cproquest_cross%3E2715338421%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c337t-a28010a95a7c048d975893fbe3a3fd16169f8f7d4f3a28fc63f8a61b3d56d3493%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2715338421&rft_id=info:pmid/&rfr_iscdi=true