Machine Learning Methods for the Prediction of Wastewater Treatment Efficiency and Anomaly Classification with Lack of Historical Data

This study examines an algorithm for collecting and analyzing data from wastewater treatment facilities, aimed at addressing regression tasks for predicting the quality of treated wastewater and classification tasks for preventing emergency situations, specifically filamentous bulking of activated s...

Bibliographic Details
Published in: Applied sciences 2024-11, Vol.14 (22), p.10689
Main Authors: Gulshin, Igor, Kuzina, Olga
Format: Article
Language: English
cited_by
cites cdi_FETCH-LOGICAL-c255t-512d232e543f3966161dfe94d94dd8e858fdcb6aa77742527bd6161df729eff93
container_end_page
container_issue 22
container_start_page 10689
container_title Applied sciences
container_volume 14
creator Gulshin, Igor
Kuzina, Olga
description This study examines an algorithm for collecting and analyzing data from wastewater treatment facilities, aimed at addressing regression tasks for predicting the quality of treated wastewater and classification tasks for preventing emergency situations, specifically filamentous bulking of activated sludge. The feasibility of using data obtained under laboratory conditions and simulating the technological process as a training dataset is explored. A small dataset collected from actual wastewater treatment plants is considered as the test dataset. For both regression and classification tasks, the best results were achieved using gradient-boosting models from the CatBoost family, yielding metrics of SMAPE = 9.1 and ROC-AUC = 1.0. A set of the most important predictors for modeling was selected for each of the target features.
doi_str_mv 10.3390/app142210689
format article
fullrecord <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_cb84172dbee142d7aba77f76ae7e075e</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A817933659</galeid><doaj_id>oai_doaj_org_article_cb84172dbee142d7aba77f76ae7e075e</doaj_id><sourcerecordid>A817933659</sourcerecordid><originalsourceid>FETCH-LOGICAL-c255t-512d232e543f3966161dfe94d94dd8e858fdcb6aa77742527bd6161df729eff93</originalsourceid><addsrcrecordid>eNpNkc9qGzEQxpfSQEOaWx9A0Gud6s9K2j0aN20CDskhocdlVhrZcteSKykEv0Cfu3K2lEiCETPf_PiYaZpPjF4J0dOvcDiwlnNGVde_a8451WohWqbfv_l_aC5z3tF6eiY6Rs-bP3dgtj4gWSOk4MOG3GHZRpuJi4mULZKHhNab4mMg0ZGfkAu-QMFEHhNC2WMo5No5bzwGcyQQLFmGuIfpSFYT5OxrCV67X3zZkjWYXyfOjc8lplqayDco8LE5czBlvPwXL5qn79ePq5vF-v7H7Wq5XhguZVlIxi0XHGUrnOiVYopZh31r67MddrJz1owKQGvdcsn1aGeN5j0614uL5nbm2gi74ZD8HtJxiOCH10RMmwFS8WbCwYxdnRi3I2Kdq9UwVqrTClAj1RIr6_PMOqT4-xlzGXbxOYVqfxBM8K6lUumquppVG6hQH1wsCUy9FvfexIDO1_yyY7oXQsmTxS9zg0kx54Tuv01Gh9Omh7ebFn8BLmCcWA</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3132840567</pqid></control><display><type>article</type><title>Machine Learning Methods for the Prediction of Wastewater Treatment Efficiency and Anomaly Classification with Lack of Historical Data</title><source>Publicly Available Content Database</source><creator>Gulshin, Igor ; Kuzina, Olga</creator><creatorcontrib>Gulshin, Igor ; Kuzina, Olga</creatorcontrib><description>This study examines an algorithm for collecting and analyzing data from wastewater treatment facilities, aimed at addressing regression tasks for predicting the quality of treated wastewater and classification tasks for preventing emergency situations, specifically filamentous bulking of activated sludge. The feasibility of using data obtained under laboratory conditions and simulating the technological process as a training dataset is explored. 
A small dataset collected from actual wastewater treatment plants is considered as the test dataset. For both regression and classification tasks, the best results were achieved using gradient-boosting models from the CatBoost family, yielding metrics of SMAPE = 9.1 and ROC-AUC = 1.0. A set of the most important predictors for modeling was selected for each of the target features.</description><identifier>ISSN: 2076-3417</identifier><identifier>EISSN: 2076-3417</identifier><identifier>DOI: 10.3390/app142210689</identifier><language>eng</language><publisher>Basel: MDPI AG</publisher><subject>Algorithms ; Data collection ; effluent quality ; Energy consumption ; Machine learning ; machine learning algorithms ; Methods ; Neural networks ; Nitrogen ; Purification ; Sensors ; Sewage ; Sludge ; soft sensors ; Time series ; wastewater treatment ; Water treatment ; Water treatment plants ; Water utilities</subject><ispartof>Applied sciences, 2024-11, Vol.14 (22), p.10689</ispartof><rights>COPYRIGHT 2024 MDPI AG</rights><rights>2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c255t-512d232e543f3966161dfe94d94dd8e858fdcb6aa77742527bd6161df729eff93</cites><orcidid>0000-0003-0481-7897 ; 0000-0002-1671-7300</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/3132840567/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/3132840567?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,25753,27924,27925,37012,44590,75126</link.rule.ids></links><search><creatorcontrib>Gulshin, Igor</creatorcontrib><creatorcontrib>Kuzina, Olga</creatorcontrib><title>Machine Learning Methods for the Prediction of Wastewater Treatment Efficiency and Anomaly Classification with Lack of Historical Data</title><title>Applied sciences</title><description>This study examines an algorithm for collecting and analyzing data from wastewater treatment facilities, aimed at addressing regression tasks for predicting the quality of treated wastewater and classification tasks for preventing emergency situations, specifically filamentous bulking of activated sludge. The feasibility of using data obtained under laboratory conditions and simulating the technological process as a training dataset is explored. A small dataset collected from actual wastewater treatment plants is considered as the test dataset. For both regression and classification tasks, the best results were achieved using gradient-boosting models from the CatBoost family, yielding metrics of SMAPE = 9.1 and ROC-AUC = 1.0. 
A set of the most important predictors for modeling was selected for each of the target features.</description><subject>Algorithms</subject><subject>Data collection</subject><subject>effluent quality</subject><subject>Energy consumption</subject><subject>Machine learning</subject><subject>machine learning algorithms</subject><subject>Methods</subject><subject>Neural networks</subject><subject>Nitrogen</subject><subject>Purification</subject><subject>Sensors</subject><subject>Sewage</subject><subject>Sludge</subject><subject>soft sensors</subject><subject>Time series</subject><subject>wastewater treatment</subject><subject>Water treatment</subject><subject>Water treatment plants</subject><subject>Water utilities</subject><issn>2076-3417</issn><issn>2076-3417</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNpNkc9qGzEQxpfSQEOaWx9A0Gud6s9K2j0aN20CDskhocdlVhrZcteSKykEv0Cfu3K2lEiCETPf_PiYaZpPjF4J0dOvcDiwlnNGVde_a8451WohWqbfv_l_aC5z3tF6eiY6Rs-bP3dgtj4gWSOk4MOG3GHZRpuJi4mULZKHhNab4mMg0ZGfkAu-QMFEHhNC2WMo5No5bzwGcyQQLFmGuIfpSFYT5OxrCV67X3zZkjWYXyfOjc8lplqayDco8LE5czBlvPwXL5qn79ePq5vF-v7H7Wq5XhguZVlIxi0XHGUrnOiVYopZh31r67MddrJz1owKQGvdcsn1aGeN5j0614uL5nbm2gi74ZD8HtJxiOCH10RMmwFS8WbCwYxdnRi3I2Kdq9UwVqrTClAj1RIr6_PMOqT4-xlzGXbxOYVqfxBM8K6lUumquppVG6hQH1wsCUy9FvfexIDO1_yyY7oXQsmTxS9zg0kx54Tuv01Gh9Omh7ebFn8BLmCcWA</recordid><startdate>20241101</startdate><enddate>20241101</enddate><creator>Gulshin, Igor</creator><creator>Kuzina, Olga</creator><general>MDPI 
AG</general><scope>AAYXX</scope><scope>CITATION</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0003-0481-7897</orcidid><orcidid>https://orcid.org/0000-0002-1671-7300</orcidid></search><sort><creationdate>20241101</creationdate><title>Machine Learning Methods for the Prediction of Wastewater Treatment Efficiency and Anomaly Classification with Lack of Historical Data</title><author>Gulshin, Igor ; Kuzina, Olga</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c255t-512d232e543f3966161dfe94d94dd8e858fdcb6aa77742527bd6161df729eff93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Data collection</topic><topic>effluent quality</topic><topic>Energy consumption</topic><topic>Machine learning</topic><topic>machine learning algorithms</topic><topic>Methods</topic><topic>Neural networks</topic><topic>Nitrogen</topic><topic>Purification</topic><topic>Sensors</topic><topic>Sewage</topic><topic>Sludge</topic><topic>soft sensors</topic><topic>Time series</topic><topic>wastewater treatment</topic><topic>Water treatment</topic><topic>Water treatment plants</topic><topic>Water utilities</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Gulshin, Igor</creatorcontrib><creatorcontrib>Kuzina, Olga</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Publicly Available 
Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>Applied sciences</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Gulshin, Igor</au><au>Kuzina, Olga</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Machine Learning Methods for the Prediction of Wastewater Treatment Efficiency and Anomaly Classification with Lack of Historical Data</atitle><jtitle>Applied sciences</jtitle><date>2024-11-01</date><risdate>2024</risdate><volume>14</volume><issue>22</issue><spage>10689</spage><pages>10689-</pages><issn>2076-3417</issn><eissn>2076-3417</eissn><abstract>This study examines an algorithm for collecting and analyzing data from wastewater treatment facilities, aimed at addressing regression tasks for predicting the quality of treated wastewater and classification tasks for preventing emergency situations, specifically filamentous bulking of activated sludge. The feasibility of using data obtained under laboratory conditions and simulating the technological process as a training dataset is explored. A small dataset collected from actual wastewater treatment plants is considered as the test dataset. For both regression and classification tasks, the best results were achieved using gradient-boosting models from the CatBoost family, yielding metrics of SMAPE = 9.1 and ROC-AUC = 1.0. 
A set of the most important predictors for modeling was selected for each of the target features.</abstract><cop>Basel</cop><pub>MDPI AG</pub><doi>10.3390/app142210689</doi><orcidid>https://orcid.org/0000-0003-0481-7897</orcidid><orcidid>https://orcid.org/0000-0002-1671-7300</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2076-3417
ispartof Applied sciences, 2024-11, Vol.14 (22), p.10689
issn 2076-3417
2076-3417
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_cb84172dbee142d7aba77f76ae7e075e
source Publicly Available Content Database
subjects Algorithms
Data collection
effluent quality
Energy consumption
Machine learning
machine learning algorithms
Methods
Neural networks
Nitrogen
Purification
Sensors
Sewage
Sludge
soft sensors
Time series
wastewater treatment
Water treatment
Water treatment plants
Water utilities
title Machine Learning Methods for the Prediction of Wastewater Treatment Efficiency and Anomaly Classification with Lack of Historical Data
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T10%3A15%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Machine%20Learning%20Methods%20for%20the%20Prediction%20of%20Wastewater%20Treatment%20Efficiency%20and%20Anomaly%20Classification%20with%20Lack%20of%20Historical%20Data&rft.jtitle=Applied%20sciences&rft.au=Gulshin,%20Igor&rft.date=2024-11-01&rft.volume=14&rft.issue=22&rft.spage=10689&rft.pages=10689-&rft.issn=2076-3417&rft.eissn=2076-3417&rft_id=info:doi/10.3390/app142210689&rft_dat=%3Cgale_doaj_%3EA817933659%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c255t-512d232e543f3966161dfe94d94dd8e858fdcb6aa77742527bd6161df729eff93%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3132840567&rft_id=info:pmid/&rft_galeid=A817933659&rfr_iscdi=true