Machine Learning Methods for the Prediction of Wastewater Treatment Efficiency and Anomaly Classification with Lack of Historical Data
This study examines an algorithm for collecting and analyzing data from wastewater treatment facilities, aimed at addressing regression tasks for predicting the quality of treated wastewater and classification tasks for preventing emergency situations, specifically filamentous bulking of activated s...
Published in: | Applied sciences 2024-11, Vol.14 (22), p.10689 |
---|---|
Main Authors: | Gulshin, Igor ; Kuzina, Olga |
Format: | Article |
Language: | English |
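Subjects: | Algorithms ; Data collection ; effluent quality ; Energy consumption ; Machine learning ; machine learning algorithms ; Methods ; Neural networks ; Nitrogen ; Purification ; Sensors ; Sewage ; Sludge ; soft sensors ; Time series ; wastewater treatment ; Water treatment ; Water treatment plants ; Water utilities |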
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c255t-512d232e543f3966161dfe94d94dd8e858fdcb6aa77742527bd6161df729eff93 |
container_end_page | |
container_issue | 22 |
container_start_page | 10689 |
container_title | Applied sciences |
container_volume | 14 |
creator | Gulshin, Igor ; Kuzina, Olga |
description | This study examines an algorithm for collecting and analyzing data from wastewater treatment facilities, aimed at addressing regression tasks for predicting the quality of treated wastewater and classification tasks for preventing emergency situations, specifically filamentous bulking of activated sludge. The feasibility of using data obtained under laboratory conditions and simulating the technological process as a training dataset is explored. A small dataset collected from actual wastewater treatment plants is considered as the test dataset. For both regression and classification tasks, the best results were achieved using gradient-boosting models from the CatBoost family, yielding metrics of SMAPE = 9.1 and ROC-AUC = 1.0. A set of the most important predictors for modeling was selected for each of the target features. |
doi_str_mv | 10.3390/app142210689 |
format | article |
publisher | Basel: MDPI AG |
date | 2024-11-01 |
eissn | 2076-3417 |
pages | 10689- |
rights | COPYRIGHT 2024 MDPI AG. 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
orcidid | https://orcid.org/0000-0003-0481-7897 ; https://orcid.org/0000-0002-1671-7300 |
peer_reviewed | true |
oa | free_for_read |
linktopdf | https://www.proquest.com/docview/3132840567/fulltextPDF?pq-origsite=primo |
linktohtml | https://www.proquest.com/docview/3132840567?pq-origsite=primo |
fulltext | fulltext |
identifier | ISSN: 2076-3417 |
ispartof | Applied sciences, 2024-11, Vol.14 (22), p.10689 |
issn | 2076-3417 2076-3417 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_cb84172dbee142d7aba77f76ae7e075e |
source | Publicly Available Content Database |
subjects | Algorithms ; Data collection ; effluent quality ; Energy consumption ; Machine learning ; machine learning algorithms ; Methods ; Neural networks ; Nitrogen ; Purification ; Sensors ; Sewage ; Sludge ; soft sensors ; Time series ; wastewater treatment ; Water treatment ; Water treatment plants ; Water utilities |
title | Machine Learning Methods for the Prediction of Wastewater Treatment Efficiency and Anomaly Classification with Lack of Historical Data |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T10%3A15%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Machine%20Learning%20Methods%20for%20the%20Prediction%20of%20Wastewater%20Treatment%20Efficiency%20and%20Anomaly%20Classification%20with%20Lack%20of%20Historical%20Data&rft.jtitle=Applied%20sciences&rft.au=Gulshin,%20Igor&rft.date=2024-11-01&rft.volume=14&rft.issue=22&rft.spage=10689&rft.pages=10689-&rft.issn=2076-3417&rft.eissn=2076-3417&rft_id=info:doi/10.3390/app142210689&rft_dat=%3Cgale_doaj_%3EA817933659%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c255t-512d232e543f3966161dfe94d94dd8e858fdcb6aa77742527bd6161df729eff93%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3132840567&rft_id=info:pmid/&rft_galeid=A817933659&rfr_iscdi=true |
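
The description above reports SMAPE = 9.1 and ROC-AUC = 1.0 without defining either metric. As a reading aid, the sketch below shows one common way to compute both. This record does not say which SMAPE variant the authors used; the version here (percent scale, range 0 to 200) is an assumption, and the function names `smape` and `roc_auc` are illustrative, not taken from the article.

```python
def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (0 to 200).

    Assumed variant: mean of |p - a| / ((|a| + |p|) / 2), times 100.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2.0
        # Assumed convention: a pair where both values are zero adds no error.
        total += 0.0 if denom == 0 else abs(p - a) / denom
    return 100.0 * total / len(actual)


def roc_auc(labels, scores):
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up numbers for illustration only; not data from the article.
    print(smape([100, 200], [110, 180]))
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Under this definition, a ROC-AUC of 1.0 means every positive case was scored above every negative case. On a small test set such a result can occur with only a handful of examples, so it is worth reading alongside the test set size given in the full text.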