An evaluation framework for input variable selection algorithms for environmental data-driven models
Input Variable Selection (IVS) is an essential step in the development of data-driven models and is particularly relevant in environmental modelling. While new methods for identifying important model inputs continue to emerge, each has its own advantages and limitations and no single method is best...
Published in: | Environmental modelling & software : with environment data news 2014-12, Vol.62, p.33-51 |
---|---|
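Main Authors: | Galelli, Stefano; Humphrey, Greer B.; Maier, Holger R.; Castelletti, Andrea; Dandy, Graeme C.; Gibbs, Matthew S. |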
Format: | Article |
Language: | English |
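Subjects: | Algorithms; Animal, plant and microbial ecology; Artificial neural networks; Assessments; Biological and medical sciences; Computer programs; Data-driven modelling; Evaluation framework; Fundamental and applied biological sciences. Psychology; General aspects. Techniques; Guidelines; Input variable selection; Large environmental datasets; Mathematical models; Methods and techniques (sampling, tagging, trapping, modelling...); Modelling; Software |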
cited_by | cdi_FETCH-LOGICAL-c452t-f075406786fb84692d681fecd962573e14ef2e6034d754b87e865e69c7d174053 |
---|---|
cites | cdi_FETCH-LOGICAL-c452t-f075406786fb84692d681fecd962573e14ef2e6034d754b87e865e69c7d174053 |
container_end_page | 51 |
container_issue | |
container_start_page | 33 |
container_title | Environmental modelling & software : with environment data news |
container_volume | 62 |
creator | Galelli, Stefano; Humphrey, Greer B.; Maier, Holger R.; Castelletti, Andrea; Dandy, Graeme C.; Gibbs, Matthew S. |
description | Input Variable Selection (IVS) is an essential step in the development of data-driven models and is particularly relevant in environmental modelling. While new methods for identifying important model inputs continue to emerge, each has its own advantages and limitations and no single method is best suited to all datasets and modelling purposes. Rigorous evaluation of new and existing input variable selection methods would allow the effectiveness of these algorithms to be properly identified in various circumstances. However, such evaluations are largely neglected due to the lack of guidelines or precedent to facilitate consistent and standardised assessment. In this paper, a new framework is proposed for the evaluation and inter-comparison of IVS methods which takes into account: (1) a wide range of dataset properties that are relevant to real world environmental data, (2) assessment criteria selected to highlight algorithm suitability in different situations of interest, and (3) a website for sharing data, algorithms and results (http://ivs4em.deib.polimi.it/). The framework is demonstrated on four IVS algorithms commonly used in environmental modelling studies and twenty-six datasets exhibiting different typical properties of environmental data. The main aim at this stage is to demonstrate the application of the proposed evaluation framework, rather than provide a definitive answer as to which of these algorithms has the best overall performance. Nevertheless, the results indicate interesting differences in the algorithms' performance that have not been identified previously.
• A framework for the evaluation of input variable selection algorithms is proposed. • The framework consists of assessment criteria and twenty-six datasets. • The framework is supported by a dedicated website (http://ivs4em.deib.polimi.it). • Four popular IVS algorithms are considered for evaluation purposes. |
doi_str_mv | 10.1016/j.envsoft.2014.08.015 |
format | article |
fulltext | fulltext |
identifier | ISSN: 1364-8152 |
ispartof | Environmental modelling & software : with environment data news, 2014-12, Vol.62, p.33-51 |
issn | 1364-8152; 1873-6726 |
language | eng |
recordid | cdi_proquest_miscellaneous_1660044575 |
source | ScienceDirect Freedom Collection |
subjects | Algorithms; Animal, plant and microbial ecology; Artificial neural networks; Assessments; Biological and medical sciences; Computer programs; Data-driven modelling; Evaluation framework; Fundamental and applied biological sciences. Psychology; General aspects. Techniques; Guidelines; Input variable selection; Large environmental datasets; Mathematical models; Methods and techniques (sampling, tagging, trapping, modelling...); Modelling; Software |
title | An evaluation framework for input variable selection algorithms for environmental data-driven models |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T17%3A00%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20evaluation%20framework%20for%20input%20variable%20selection%20algorithms%20for%20environmental%20data-driven%20models&rft.jtitle=Environmental%20modelling%20&%20software%20:%20with%20environment%20data%20news&rft.au=Galelli,%20Stefano&rft.date=2014-12-01&rft.volume=62&rft.spage=33&rft.epage=51&rft.pages=33-51&rft.issn=1364-8152&rft.eissn=1873-6726&rft_id=info:doi/10.1016/j.envsoft.2014.08.015&rft_dat=%3Cproquest_cross%3E1660044575%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c452t-f075406786fb84692d681fecd962573e14ef2e6034d754b87e865e69c7d174053%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1647013466&rft_id=info:pmid/&rfr_iscdi=true |