A validated ensemble method for multinomial land-cover classification
Published in: | Ecological informatics 2020-03, Vol.56, p.101065, Article 101065 |
---|---|
Format: | Article |
Language: | English |
Summary: | Land-cover data provide valuable information for landscape management and can be generated using machine learning algorithms. Ensemble models, or model averaging, can overcome difficulties in selecting an adequate algorithm and improve model predictions, but their use remains limited among ecologists. The objective of this study is to highlight the benefits and limitations of weighted and unweighted majority voting ensemble models for land-cover classification, and to enable easier and wider implementation of the method by providing an R script (for use in the R software). Using a case study of three mixed-use landscapes from southern Australia (Tasmania), land cover was classified into six classes using Landsat 8 imagery and ancillary data, with support vector machine, random forest, k-nearest neighbour and naïve Bayes classifiers as base algorithms. The predicted classifications of the base algorithms were then combined using both an unweighted and a weighted (by the true skill statistic) majority voting ensemble algorithm. Cross-validation results showed that the base algorithms achieved similar accuracy, making algorithm selection difficult. The base algorithms achieved high and similar predictive accuracy when the classified land cover and the training data belonged to the same geographic region, but lower and more variable predictive accuracy when they belonged to different geographic regions. The weighted and unweighted ensembles achieved similar overall accuracy, equivalent to that of the best-performing base algorithm. We conclude that the majority voting ensemble can be adopted to overcome difficulties in model selection during land-cover classification.
• Using cross-validation resulted in no preferred algorithm for classification.
• Different base algorithms achieved the highest accuracy in different scenarios.
• The weighted and unweighted ensembles consistently achieved high accuracy.
• The ensemble algorithm is provided as a script for use with the R software. |
---|---|
ISSN: | 1574-9541 |
DOI: | 10.1016/j.ecoinf.2020.101065 |
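
The summary describes combining base-classifier predictions by unweighted and by true-skill-statistic-weighted majority voting. The following is a minimal Python sketch of that general idea, not the authors' R script. Several details are assumptions made for illustration only: the function names, the use of a macro-averaged one-vs-rest TSS as the per-model weight, clipping negative weights to zero, and breaking ties alphabetically. The record does not say how the published script handles any of these.

```python
from collections import defaultdict


def true_skill_statistic(y_true, y_pred, positive):
    """One-vs-rest TSS for a single class: sensitivity + specificity - 1."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity + specificity - 1.0


def macro_tss(y_true, y_pred):
    """Assumed multiclass extension: mean one-vs-rest TSS over observed classes."""
    classes = sorted(set(y_true))
    return sum(true_skill_statistic(y_true, y_pred, c) for c in classes) / len(classes)


def majority_vote(predictions, weights=None):
    """Combine per-model label lists into one label per sample.

    predictions: dict mapping model name -> list of predicted labels
    weights:     dict mapping model name -> vote weight, or None for
                 an unweighted vote (every model counts as 1.0)
    """
    names = list(predictions)
    if weights is None:
        weights = {name: 1.0 for name in names}
    n_samples = len(predictions[names[0]])
    combined = []
    for i in range(n_samples):
        scores = defaultdict(float)
        for name in names:
            scores[predictions[name][i]] += weights[name]
        # Assumed tie-break: highest score first, then alphabetical label order.
        combined.append(min(scores, key=lambda label: (-scores[label], label)))
    return combined


if __name__ == "__main__":
    # Hypothetical validation labels and base-model predictions.
    truth = ["forest", "forest", "water", "crop", "crop", "water"]
    preds = {
        "svm": ["forest", "forest", "water", "crop", "water", "water"],
        "rf": ["forest", "crop", "water", "crop", "crop", "water"],
        "knn": ["crop", "crop", "water", "crop", "crop", "forest"],
        "nb": ["forest", "crop", "crop", "forest", "water", "forest"],
    }
    # Weight each model by its TSS on the validation labels (clipped at zero).
    tss_weights = {name: max(macro_tss(truth, p), 0.0) for name, p in preds.items()}
    print("unweighted:", majority_vote(preds))
    print("weighted:  ", majority_vote(preds, tss_weights))
```

In practice the weights would be estimated on held-out data rather than on the samples being classified, so that a model's vote reflects its out-of-sample skill.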