An assessment of training data for agricultural land cover classification: a case study of Bafra, Türkiye
Published in: | Earth science informatics 2025-01, Vol.18 (1), p.7, Article 7 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Summary: | Training data play a pivotal role in the accuracy of machine learning (ML) models in remote sensing. In particular, the size and purity of the training set have a large influence on classification accuracy. The purpose of this experimental research is to investigate the impact of different training set sizes on supervised machine learning classifiers for agricultural land cover classification in remote sensing. In our experiment, the training set size for each class was incrementally increased through the following levels: 1%, 5%, 10%, 20%, 30%, 40%, and 50%. The remaining 50% of the full ground truth data was used to evaluate model accuracy. The test site is situated in Bafra Plain, Samsun, Turkey, and the agricultural land cover classification was performed using multispectral Sentinel-2 imagery with four ML models, namely Support Vector Machines (SVM), Random Forest (RF), Light Gradient Boosting Machines (LightGBM), and Kernel Extreme Learning Machines (KELM). The experimental results demonstrated that the highest classification accuracy was achieved by LightGBM (89.93%), followed by RF (86.49%), KELM (78.38%), and SVM (72.49%). The classification accuracies of the tree-based methods (RF and LightGBM) increased as the training set size grew; however, the kernel-based methods (KELM and SVM) exhibited unstable results as the size of the training dataset varied. Furthermore, our findings highlight that each machine learning model shows a different sensitivity to variations in training set size in agricultural land cover classification. |
---|---|
ISSN: | 1865-0473 1865-0481 |
DOI: | 10.1007/s12145-024-01555-5 |
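
The sampling scheme described in the summary (per-class training sets of 1% to 50%, with the other 50% of the ground truth held out for evaluation) could be sketched roughly as below. This is only an illustration, not the authors' code: the function name `stratified_splits`, the fixed random seed, the use of nested training subsets, and the reading of each percentage as a share of the full per-class ground truth are all assumptions that the record does not confirm.

```python
import random
from collections import defaultdict

# Training fractions listed in the summary (assumed to be shares of each
# class's full ground truth, not of the training half).
TRAIN_FRACTIONS = [0.01, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50]


def stratified_splits(labels, fractions=TRAIN_FRACTIONS, test_fraction=0.5, seed=0):
    """Split sample indices per class into one held-out test set and one
    training set per fraction. Hypothetical helper, not from the article."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    test_idx = []
    train_sets = {f: [] for f in fractions}
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Hold out a fixed share of every class for evaluation.
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        pool = idxs[n_test:]
        for f in fractions:
            # Take a prefix of the shuffled pool so smaller training sets are
            # nested inside larger ones (an assumption); keep at least one
            # sample per class and never exceed the pool.
            n_train = max(1, min(len(pool), round(len(idxs) * f)))
            train_sets[f].extend(pool[:n_train])
    return train_sets, test_idx
```

Each entry of `train_sets` would then be used to fit a classifier (SVM, RF, LightGBM, or KELM in the study), with accuracy measured on the samples in `test_idx`, which never overlap with any training set.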