
Validity evaluation of a machine-learning model for chlorophyll a retrieval using Sentinel-2 from inland and coastal waters


Bibliographic Details
Published in: Ecological indicators 2022-04, Vol.137, p.108737, Article 108737
Main Authors: Woo Kim, Young ; Kim, TaeHo ; Shin, Jihoon ; Lee, Dae-Seong ; Park, Young-Seuk ; Kim, Yeji ; Cha, YoonKyung
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c422t-a215e8e5c203391b7517fe0c9174734cf6b7dee5083b18837993082e58f8fa453
cites cdi_FETCH-LOGICAL-c422t-a215e8e5c203391b7517fe0c9174734cf6b7dee5083b18837993082e58f8fa453
container_start_page 108737
container_title Ecological indicators
container_volume 137
creator Woo Kim, Young
Kim, TaeHo
Shin, Jihoon
Lee, Dae-Seong
Park, Young-Seuk
Kim, Yeji
Cha, YoonKyung
description •ML models were developed for Chla retrieval for inland and coastal waters using MSI.
•Light gradient boosting machine (LGBM) outperformed other ML algorithms.
•Post-hoc explanations to LGBM were provided using SHAP.
•Rrs(704)/Rrs(665) was the most important input feature.
•Percent forest within the 500-m buffer zone explained among-lake Chla variations.
The MultiSpectral Instrument (MSI) on-board Sentinel-2 provides satellite images at spatiotemporal resolutions suitable for chlorophyll a (Chla) retrieval from inland and coastal waters. Machine-learning (ML) algorithms including light gradient boosting machine (LGBM) were employed for Chla retrieval from MSI. The study area encompasses 78 lakes and estuaries located across four major river watersheds in South Korea. Matchup data between MSI overpass and near-concurrent in situ Chla measurements from December 2018 to April 2021 were included. The remote sensing reflectance (Rrs) values of six single spectral bands and four two-band ratios were used as the input features. Despite the difficulty in Chla estimation in optically complex waters, ML algorithms showed overall reasonable accuracy. Among the ML algorithms, LGBM exhibited the best performance (R² = 0.75, bias = -0.15, slope = 0.73, RMSE = 15.15 mg·m⁻³, MAE = 9.49 mg·m⁻³) over a wide range of trophic states. Post-hoc interpretations of the best-performing LGBM using Shapley additive explanations indicated that Rrs(704)/Rrs(665) was the most important feature, while Rrs(739)/Rrs(704) and Rrs(492)/Rrs(560) played auxiliary roles in Chla retrieval through interaction with Rrs(704)/Rrs(665). Among-lake spatial variations of Chla were explained by percent forest and agricultural area within the buffer zone at multiple scales (buffer widths of 50 m and 500 m). The associations between the modeled Chla and buffer land cover types, that is, increase in Chla concentration with increase in percent forest and decrease in percent agricultural area, were consistent with the established ecological knowledge. Overall, the model interpretations and spatial variations in Chla within and among lakes confirmed the validity of LGBM for retrieving MSI-derived Chla from lakes and estuaries. Our study can serve as a reference for evaluating the validity of complex ML models for inland water remote sensing.
doi_str_mv 10.1016/j.ecolind.2022.108737
format article
publisher Elsevier Ltd
rights 2022 The Authors
oa free_for_read
fulltext fulltext
identifier ISSN: 1470-160X
ispartof Ecological indicators, 2022-04, Vol.137, p.108737, Article 108737
issn 1470-160X
1872-7034
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_2ed5c698b2654d86bee885cbde9ed5a6
source ScienceDirect Freedom Collection
subjects Chlorophyll a
Inland and coastal waters
Land cover
Machine learning
MSI on-board Sentinel-2
Multiscale
title Validity evaluation of a machine-learning model for chlorophyll a retrieval using Sentinel-2 from inland and coastal waters
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T04%3A49%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-elsevier_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Validity%20evaluation%20of%20a%20machine-learning%20model%20for%20chlorophyll%20a%20retrieval%20using%20Sentinel-2%20from%20inland%20and%20coastal%20waters&rft.jtitle=Ecological%20indicators&rft.au=Woo%20Kim,%20Young&rft.date=2022-04&rft.volume=137&rft.spage=108737&rft.pages=108737-&rft.artnum=108737&rft.issn=1470-160X&rft.eissn=1872-7034&rft_id=info:doi/10.1016/j.ecolind.2022.108737&rft_dat=%3Celsevier_doaj_%3ES1470160X22002084%3C/elsevier_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c422t-a215e8e5c203391b7517fe0c9174734cf6b7dee5083b18837993082e58f8fa453%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true