Multiple Data Imputation Methods Advance Risk Analysis and Treatability of Co-occurring Inorganic Chemicals in Groundwater
Accurately assessing and managing risks associated with inorganic pollutants in groundwater is imperative. Historic water quality databases are often sparse due to rationale or financial budgets for sample collection and analysis, posing challenges in evaluating exposure or water treatment effective...
Published in: | Environmental science & technology 2024-11, Vol.58 (46), p.20513-20524 |
---|---|
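Main Authors: | Mahmood, Akhlak U.; Islam, Minhazul; Gulyuk, Alexey V.; Briese, Emily; Velasco, Carmen A.; Malu, Mohit; Sharma, Naushita; Spanias, Andreas; Yingling, Yaroslava G.; Westerhoff, Paul |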
Format: | Article |
Language: | English |
Subjects: | Data Science |
Citations: | Items that this one cites |
Online Access: | Get full text |
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-a314t-8443393010103b3cc38a14e74c0453526d8f0e7f90e1f84e8c98d2ecd09f1dd83 |
container_end_page | 20524 |
container_issue | 46 |
container_start_page | 20513 |
container_title | Environmental science & technology |
container_volume | 58 |
creator | Mahmood, Akhlak U.; Islam, Minhazul; Gulyuk, Alexey V.; Briese, Emily; Velasco, Carmen A.; Malu, Mohit; Sharma, Naushita; Spanias, Andreas; Yingling, Yaroslava G.; Westerhoff, Paul |
description | Accurately assessing and managing risks associated with inorganic pollutants in groundwater is imperative. Historic water quality databases are often sparse due to rationale or financial budgets for sample collection and analysis, posing challenges in evaluating exposure or water treatment effectiveness. We utilized and compared two advanced multiple data imputation techniques, AMELIA and MICE algorithms, to fill gaps in sparse groundwater quality data sets. AMELIA outperformed MICE in handling missing values, as MICE tended to overestimate certain values, resulting in more outliers. Field data sets revealed that 75% to 80% of samples exhibited no co-occurring regulated pollutants surpassing MCL values, whereas imputed values showed only 15% to 55% of the samples posed no health risks. Imputed data unveiled a significant increase, ranging from 2 to 5 times, in the number of sampling locations predicted to potentially exceed health-based limits and identified samples where 2 to 6 co-occurring chemicals may occur and surpass health-based levels. Linking imputed data to sampling locations can pinpoint potential hotspots of elevated chemical levels and guide optimal resource allocation for additional field sampling and chemical analysis. With this approach, further analysis of complete data sets allows state agencies authorized to conduct groundwater monitoring, often with limited financial resources, to prioritize sampling locations and chemicals to be tested. Given existing data and time constraints, it is crucial to identify the most strategic use of the available resources to address data gaps effectively. This work establishes a framework to enhance the beneficial impact of funding groundwater data collection by reducing uncertainty in prioritizing future sampling locations and chemical analyses. |
doi_str_mv | 10.1021/acs.est.4c05203 |
format | article |
fulltext | fulltext |
identifier | ISSN: 0013-936X |
ispartof | Environmental science & technology, 2024-11, Vol.58 (46), p.20513-20524 |
issn | 0013-936X; 1520-5851 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11580165 |
source | American Chemical Society:Jisc Collections:American Chemical Society Read & Publish Agreement 2022-2024 (Reading list) |
subjects | Data Science |
title | Multiple Data Imputation Methods Advance Risk Analysis and Treatability of Co-occurring Inorganic Chemicals in Groundwater |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T14%3A36%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Multiple%20Data%20Imputation%20Methods%20Advance%20Risk%20Analysis%20and%20Treatability%20of%20Co-occurring%20Inorganic%20Chemicals%20in%20Groundwater&rft.jtitle=Environmental%20science%20&%20technology&rft.au=Mahmood,%20Akhlak%20U.&rft.date=2024-11-19&rft.volume=58&rft.issue=46&rft.spage=20513&rft.epage=20524&rft.pages=20513-20524&rft.issn=0013-936X&rft.eissn=1520-5851&rft_id=info:doi/10.1021/acs.est.4c05203&rft_dat=%3Cproquest_pubme%3E3128318968%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a314t-8443393010103b3cc38a14e74c0453526d8f0e7f90e1f84e8c98d2ecd09f1dd83%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3128318968&rft_id=info:pmid/39509340&rfr_iscdi=true |