
An evaluation framework for input variable selection algorithms for environmental data-driven models

Input Variable Selection (IVS) is an essential step in the development of data-driven models and is particularly relevant in environmental modelling. While new methods for identifying important model inputs continue to emerge, each has its own advantages and limitations and no single method is best...


Bibliographic Details
Published in: Environmental modelling & software : with environment data news, 2014-12, Vol.62, p.33-51
Main Authors: Galelli, Stefano; Humphrey, Greer B.; Maier, Holger R.; Castelletti, Andrea; Dandy, Graeme C.; Gibbs, Matthew S.
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c452t-f075406786fb84692d681fecd962573e14ef2e6034d754b87e865e69c7d174053
cites cdi_FETCH-LOGICAL-c452t-f075406786fb84692d681fecd962573e14ef2e6034d754b87e865e69c7d174053
container_end_page 51
container_issue
container_start_page 33
container_title Environmental modelling & software : with environment data news
container_volume 62
creator Galelli, Stefano
Humphrey, Greer B.
Maier, Holger R.
Castelletti, Andrea
Dandy, Graeme C.
Gibbs, Matthew S.
description Input Variable Selection (IVS) is an essential step in the development of data-driven models and is particularly relevant in environmental modelling. While new methods for identifying important model inputs continue to emerge, each has its own advantages and limitations and no single method is best suited to all datasets and modelling purposes. Rigorous evaluation of new and existing input variable selection methods would allow the effectiveness of these algorithms to be properly identified in various circumstances. However, such evaluations are largely neglected due to the lack of guidelines or precedent to facilitate consistent and standardised assessment. In this paper, a new framework is proposed for the evaluation and inter-comparison of IVS methods which takes into account: (1) a wide range of dataset properties that are relevant to real world environmental data, (2) assessment criteria selected to highlight algorithm suitability in different situations of interest, and (3) a website for sharing data, algorithms and results (http://ivs4em.deib.polimi.it/). The framework is demonstrated on four IVS algorithms commonly used in environmental modelling studies and twenty-six datasets exhibiting different typical properties of environmental data. The main aim at this stage is to demonstrate the application of the proposed evaluation framework, rather than provide a definitive answer as to which of these algorithms has the best overall performance. Nevertheless, the results indicate interesting differences in the algorithms' performance that have not been identified previously.
• A framework for the evaluation of input variable selection algorithms is proposed.
• The framework consists of assessment criteria and twenty-six datasets.
• The framework is supported by a dedicated website (http://ivs4em.deib.polimi.it).
• Four popular IVS algorithms are considered for evaluation purposes.
doi_str_mv 10.1016/j.envsoft.2014.08.015
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1660044575</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1364815214002394</els_id><sourcerecordid>1660044575</sourcerecordid><originalsourceid>FETCH-LOGICAL-c452t-f075406786fb84692d681fecd962573e14ef2e6034d754b87e865e69c7d174053</originalsourceid><addsrcrecordid>eNqNkc1uFDEQhEcoSCSBR0CaC1IuM7Q9_tsTiqIkIEXiAmfLa7fBi8de7NmJeHuc7IprOHUfvupSV3XdewIjASI-7kZMa81-GSkQNoIagfBX3TlRchqEpOKs7ZNggyKcvukuat0BNISy885dpx5XEw9mCTn1vpgZH3P51ftc-pD2h6VfTQlmG7GvGNE-Yyb-yCUsP-f6zDX7UHKaMS0m9s4sZnAlrJj6OTuM9W332ptY8d1pXnbf726_3XweHr7ef7m5fhgs43QZPEjOQEgl_FYxsaFOKOLRuo2gXE5IGHqKAibmGrhVEpXgKDZWOiIZ8Omyuzre3Zf8-4B10XOoFmM0CfOhaiIEAGNc_g_KJJCJCdFQfkRtybUW9HpfwmzKH01APxWgd_pUgH4qQIPSLd2m-3CyMNWa2KJNNtR_YrqBSbZvG_fpyLWkcA1YdLUBk0UXSstbuxxecPoLC7ifSQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1647013466</pqid></control><display><type>article</type><title>An evaluation framework for input variable selection algorithms for environmental data-driven models</title><source>ScienceDirect Freedom Collection</source><creator>Galelli, Stefano ; Humphrey, Greer B. ; Maier, Holger R. ; Castelletti, Andrea ; Dandy, Graeme C. ; Gibbs, Matthew S.</creator><creatorcontrib>Galelli, Stefano ; Humphrey, Greer B. ; Maier, Holger R. ; Castelletti, Andrea ; Dandy, Graeme C. ; Gibbs, Matthew S.</creatorcontrib><description>Input Variable Selection (IVS) is an essential step in the development of data-driven models and is particularly relevant in environmental modelling. While new methods for identifying important model inputs continue to emerge, each has its own advantages and limitations and no single method is best suited to all datasets and modelling purposes. 
Rigorous evaluation of new and existing input variable selection methods would allow the effectiveness of these algorithms to be properly identified in various circumstances. However, such evaluations are largely neglected due to the lack of guidelines or precedent to facilitate consistent and standardised assessment. In this paper, a new framework is proposed for the evaluation and inter-comparison of IVS methods which takes into account: (1) a wide range of dataset properties that are relevant to real world environmental data, (2) assessment criteria selected to highlight algorithm suitability in different situations of interest, and (3) a website for sharing data, algorithms and results (http://ivs4em.deib.polimi.it/). The framework is demonstrated on four IVS algorithms commonly used in environmental modelling studies and twenty-six datasets exhibiting different typical properties of environmental data. The main aim at this stage is to demonstrate the application of the proposed evaluation framework, rather than provide a definitive answer as to which of these algorithms has the best overall performance. Nevertheless, the results indicate interesting differences in the algorithms' performance that have not been identified previously. 
•A framework for the evaluation of input variable selection algorithms is proposed.•The framework consists of assessment criteria and twenty-six datasets.•The framework is supported by a dedicated website (http://ivs4em.deib.polimi.it).•Four popular IVS algorithms are considered for evaluation purposes.</description><identifier>ISSN: 1364-8152</identifier><identifier>EISSN: 1873-6726</identifier><identifier>DOI: 10.1016/j.envsoft.2014.08.015</identifier><language>eng</language><publisher>Oxford: Elsevier Ltd</publisher><subject>Algorithms ; Animal, plant and microbial ecology ; Artificial neural networks ; Assessments ; Biological and medical sciences ; Computer programs ; Data-driven modelling ; Evaluation framework ; Fundamental and applied biological sciences. Psychology ; General aspects. Techniques ; Guidelines ; Input variable selection ; Large environmental datasets ; Mathematical models ; Methods and techniques (sampling, tagging, trapping, modelling...) ; Modelling ; Software</subject><ispartof>Environmental modelling &amp; software : with environment data news, 2014-12, Vol.62, p.33-51</ispartof><rights>2014 Elsevier Ltd</rights><rights>2015 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c452t-f075406786fb84692d681fecd962573e14ef2e6034d754b87e865e69c7d174053</citedby><cites>FETCH-LOGICAL-c452t-f075406786fb84692d681fecd962573e14ef2e6034d754b87e865e69c7d174053</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=29037067$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Galelli, Stefano</creatorcontrib><creatorcontrib>Humphrey, Greer 
B.</creatorcontrib><creatorcontrib>Maier, Holger R.</creatorcontrib><creatorcontrib>Castelletti, Andrea</creatorcontrib><creatorcontrib>Dandy, Graeme C.</creatorcontrib><creatorcontrib>Gibbs, Matthew S.</creatorcontrib><title>An evaluation framework for input variable selection algorithms for environmental data-driven models</title><title>Environmental modelling &amp; software : with environment data news</title><description>Input Variable Selection (IVS) is an essential step in the development of data-driven models and is particularly relevant in environmental modelling. While new methods for identifying important model inputs continue to emerge, each has its own advantages and limitations and no single method is best suited to all datasets and modelling purposes. Rigorous evaluation of new and existing input variable selection methods would allow the effectiveness of these algorithms to be properly identified in various circumstances. However, such evaluations are largely neglected due to the lack of guidelines or precedent to facilitate consistent and standardised assessment. In this paper, a new framework is proposed for the evaluation and inter-comparison of IVS methods which takes into account: (1) a wide range of dataset properties that are relevant to real world environmental data, (2) assessment criteria selected to highlight algorithm suitability in different situations of interest, and (3) a website for sharing data, algorithms and results (http://ivs4em.deib.polimi.it/). The framework is demonstrated on four IVS algorithms commonly used in environmental modelling studies and twenty-six datasets exhibiting different typical properties of environmental data. The main aim at this stage is to demonstrate the application of the proposed evaluation framework, rather than provide a definitive answer as to which of these algorithms has the best overall performance. 
Nevertheless, the results indicate interesting differences in the algorithms' performance that have not been identified previously. •A framework for the evaluation of input variable selection algorithms is proposed.•The framework consists of assessment criteria and twenty-six datasets.•The framework is supported by a dedicated website (http://ivs4em.deib.polimi.it).•Four popular IVS algorithms are considered for evaluation purposes.</description><subject>Algorithms</subject><subject>Animal, plant and microbial ecology</subject><subject>Artificial neural networks</subject><subject>Assessments</subject><subject>Biological and medical sciences</subject><subject>Computer programs</subject><subject>Data-driven modelling</subject><subject>Evaluation framework</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>General aspects. Techniques</subject><subject>Guidelines</subject><subject>Input variable selection</subject><subject>Large environmental datasets</subject><subject>Mathematical models</subject><subject>Methods and techniques (sampling, tagging, trapping, modelling...)</subject><subject>Modelling</subject><subject>Software</subject><issn>1364-8152</issn><issn>1873-6726</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><recordid>eNqNkc1uFDEQhEcoSCSBR0CaC1IuM7Q9_tsTiqIkIEXiAmfLa7fBi8de7NmJeHuc7IprOHUfvupSV3XdewIjASI-7kZMa81-GSkQNoIagfBX3TlRchqEpOKs7ZNggyKcvukuat0BNISy885dpx5XEw9mCTn1vpgZH3P51ftc-pD2h6VfTQlmG7GvGNE-Yyb-yCUsP-f6zDX7UHKaMS0m9s4sZnAlrJj6OTuM9W332ptY8d1pXnbf726_3XweHr7ef7m5fhgs43QZPEjOQEgl_FYxsaFOKOLRuo2gXE5IGHqKAibmGrhVEpXgKDZWOiIZ8Omyuzre3Zf8-4B10XOoFmM0CfOhaiIEAGNc_g_KJJCJCdFQfkRtybUW9HpfwmzKH01APxWgd_pUgH4qQIPSLd2m-3CyMNWa2KJNNtR_YrqBSbZvG_fpyLWkcA1YdLUBk0UXSstbuxxecPoLC7ifSQ</recordid><startdate>20141201</startdate><enddate>20141201</enddate><creator>Galelli, Stefano</creator><creator>Humphrey, Greer B.</creator><creator>Maier, Holger 
R.</creator><creator>Castelletti, Andrea</creator><creator>Dandy, Graeme C.</creator><creator>Gibbs, Matthew S.</creator><general>Elsevier Ltd</general><general>Elsevier</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QH</scope><scope>7ST</scope><scope>7UA</scope><scope>C1K</scope><scope>F1W</scope><scope>H97</scope><scope>L.G</scope><scope>SOI</scope><scope>7SC</scope><scope>7SU</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20141201</creationdate><title>An evaluation framework for input variable selection algorithms for environmental data-driven models</title><author>Galelli, Stefano ; Humphrey, Greer B. ; Maier, Holger R. ; Castelletti, Andrea ; Dandy, Graeme C. ; Gibbs, Matthew S.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c452t-f075406786fb84692d681fecd962573e14ef2e6034d754b87e865e69c7d174053</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Algorithms</topic><topic>Animal, plant and microbial ecology</topic><topic>Artificial neural networks</topic><topic>Assessments</topic><topic>Biological and medical sciences</topic><topic>Computer programs</topic><topic>Data-driven modelling</topic><topic>Evaluation framework</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>General aspects. 
Techniques</topic><topic>Guidelines</topic><topic>Input variable selection</topic><topic>Large environmental datasets</topic><topic>Mathematical models</topic><topic>Methods and techniques (sampling, tagging, trapping, modelling...)</topic><topic>Modelling</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Galelli, Stefano</creatorcontrib><creatorcontrib>Humphrey, Greer B.</creatorcontrib><creatorcontrib>Maier, Holger R.</creatorcontrib><creatorcontrib>Castelletti, Andrea</creatorcontrib><creatorcontrib>Dandy, Graeme C.</creatorcontrib><creatorcontrib>Gibbs, Matthew S.</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Aqualine</collection><collection>Environment Abstracts</collection><collection>Water Resources Abstracts</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ASFA: Aquatic Sciences and Fisheries Abstracts</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) 3: Aquatic Pollution &amp; Environmental Quality</collection><collection>Aquatic Science &amp; Fisheries Abstracts (ASFA) Professional</collection><collection>Environment Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Environmental Engineering Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Environmental modelling &amp; software : with environment data news</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Galelli, Stefano</au><au>Humphrey, Greer B.</au><au>Maier, Holger R.</au><au>Castelletti, Andrea</au><au>Dandy, Graeme C.</au><au>Gibbs, Matthew S.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An evaluation framework for input variable selection algorithms for environmental data-driven models</atitle><jtitle>Environmental modelling &amp; software : with environment data news</jtitle><date>2014-12-01</date><risdate>2014</risdate><volume>62</volume><spage>33</spage><epage>51</epage><pages>33-51</pages><issn>1364-8152</issn><eissn>1873-6726</eissn><abstract>Input Variable Selection (IVS) is an essential step in the development of data-driven models and is particularly relevant in environmental modelling. While new methods for identifying important model inputs continue to emerge, each has its own advantages and limitations and no single method is best suited to all datasets and modelling purposes. Rigorous evaluation of new and existing input variable selection methods would allow the effectiveness of these algorithms to be properly identified in various circumstances. However, such evaluations are largely neglected due to the lack of guidelines or precedent to facilitate consistent and standardised assessment. In this paper, a new framework is proposed for the evaluation and inter-comparison of IVS methods which takes into account: (1) a wide range of dataset properties that are relevant to real world environmental data, (2) assessment criteria selected to highlight algorithm suitability in different situations of interest, and (3) a website for sharing data, algorithms and results (http://ivs4em.deib.polimi.it/). The framework is demonstrated on four IVS algorithms commonly used in environmental modelling studies and twenty-six datasets exhibiting different typical properties of environmental data. 
The main aim at this stage is to demonstrate the application of the proposed evaluation framework, rather than provide a definitive answer as to which of these algorithms has the best overall performance. Nevertheless, the results indicate interesting differences in the algorithms' performance that have not been identified previously. •A framework for the evaluation of input variable selection algorithms is proposed.•The framework consists of assessment criteria and twenty-six datasets.•The framework is supported by a dedicated website (http://ivs4em.deib.polimi.it).•Four popular IVS algorithms are considered for evaluation purposes.</abstract><cop>Oxford</cop><pub>Elsevier Ltd</pub><doi>10.1016/j.envsoft.2014.08.015</doi><tpages>19</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1364-8152
ispartof Environmental modelling & software : with environment data news, 2014-12, Vol.62, p.33-51
issn 1364-8152
1873-6726
language eng
recordid cdi_proquest_miscellaneous_1660044575
source ScienceDirect Freedom Collection
subjects Algorithms
Animal, plant and microbial ecology
Artificial neural networks
Assessments
Biological and medical sciences
Computer programs
Data-driven modelling
Evaluation framework
Fundamental and applied biological sciences. Psychology
General aspects. Techniques
Guidelines
Input variable selection
Large environmental datasets
Mathematical models
Methods and techniques (sampling, tagging, trapping, modelling...)
Modelling
Software
title An evaluation framework for input variable selection algorithms for environmental data-driven models
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T17%3A00%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20evaluation%20framework%20for%20input%20variable%20selection%20algorithms%20for%20environmental%20data-driven%20models&rft.jtitle=Environmental%20modelling%20&%20software%20:%20with%20environment%20data%20news&rft.au=Galelli,%20Stefano&rft.date=2014-12-01&rft.volume=62&rft.spage=33&rft.epage=51&rft.pages=33-51&rft.issn=1364-8152&rft.eissn=1873-6726&rft_id=info:doi/10.1016/j.envsoft.2014.08.015&rft_dat=%3Cproquest_cross%3E1660044575%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c452t-f075406786fb84692d681fecd962573e14ef2e6034d754b87e865e69c7d174053%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1647013466&rft_id=info:pmid/&rfr_iscdi=true