Investigation of the influence of nonoccurrence sampling on landslide susceptibility assessment using Artificial Neural Networks

Bibliographic Details
Published in: Catena (Giessen) 2021-03, Vol. 198, p. 105067, Article 105067
Main Authors: Lucchese, Luísa Vieira; de Oliveira, Guilherme Garcia; Pedrollo, Olavo Correa
Format: Article
Language: English
Description
Summary:
• Nonoccurrence sample acquisition is relevant to the resulting susceptibility maps.
• ANN models can lose generalization ability if the training set is too easy to classify.
• Using only easy-to-classify nonoccurrence samples affects maps negatively.

Landslide susceptibility assessment using Artificial Neural Networks (ANNs) requires occurrence (landslide) and nonoccurrence (not prone to landslide) samples for ANN training. We present empirical evidence that a priori intervention on the nonoccurrence samples can produce models that generalize poorly. Thirteen nonoccurrence cases based on GIS data from the Rolante River basin (828.26 km²) in Brazil are studied, divided into three groups. The first group (BO) was based on six combinations of buffers with different minimum and maximum distances from the mapped scars. The second group (RO) acquired nonoccurrence samples only from a rectangle in the lowlands, an area known not to be susceptible to landslides. The third group (BR) comprised six alternatives corresponding to those of BO, with the addition of nonoccurrence samples acquired from the same rectangle used for RO. Accuracy (acc) and the Area Under the Receiver Operating Characteristic Curve (AUC) were calculated. RO resulted in perfect discrimination between susceptible and non-susceptible samples (acc = 1 and AUC = 1). This occurred because the model simply classified as susceptible any point whose attributes differ from those in the rectangle, which harms the classification of nonoccurrence points outside the rectangle. The RO map shows large areas classified as susceptible that are known to be non-susceptible. In BR, sampling points from the rectangle, which are easy to classify, were also added to the verification sample. For BO 00 m (minimum buffer distance to scars of 0 m), average acc was 89.45% and average AUC was 0.9409; for BR 00 m, average acc was 92.33% and average AUC was 0.9616. The maps of groups BO and BR were alike. This indicates that metrics can be artificially inflated when biased samples are added, even though the final map is not visibly affected. To avoid this effect, easily classifiable samples generated from expert knowledge should be used with care.
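
The metric inflation described in the summary can be illustrated with a small, purely hypothetical sketch. The scores below are invented for illustration and are not data from the article; the function names `auc` and `accuracy`, the 0.5 threshold, and the split into "hard" and "easy" nonoccurrence samples are assumptions made here. The model outputs are held fixed, and only the composition of the verification sample changes.

```python
def auc(pos_scores, neg_scores):
    """Pairwise AUC: share of (occurrence, nonoccurrence) pairs in which
    the occurrence sample gets the higher score; ties count as half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def accuracy(pos_scores, neg_scores, threshold=0.5):
    """Share of samples on the correct side of an assumed threshold."""
    correct = sum(s >= threshold for s in pos_scores)
    correct += sum(s < threshold for s in neg_scores)
    return correct / (len(pos_scores) + len(neg_scores))


# Invented model outputs (NOT from the article), one score per sample.
occurrence = [0.9, 0.8, 0.7, 0.4]            # mapped landslide scars
hard_nonoccurrence = [0.6, 0.45, 0.3, 0.2]   # e.g. points near the scars
easy_nonoccurrence = [0.05, 0.03, 0.02, 0.01]  # e.g. lowland rectangle

# Same model, same scores: only the verification sample differs.
auc_hard = auc(occurrence, hard_nonoccurrence)
auc_mixed = auc(occurrence, hard_nonoccurrence + easy_nonoccurrence)
acc_hard = accuracy(occurrence, hard_nonoccurrence)
acc_mixed = accuracy(occurrence, hard_nonoccurrence + easy_nonoccurrence)

print(f"AUC: {auc_hard:.4f} -> {auc_mixed:.4f}")
print(f"acc: {acc_hard:.4f} -> {acc_mixed:.4f}")
```

With these invented numbers, AUC rises from 0.875 to 0.9375 and accuracy from 0.75 to about 0.83, although no prediction on the original samples has changed. This mirrors, in toy form, the BO versus BR comparison reported above.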
ISSN: 0341-8162, 1872-6887
DOI: 10.1016/j.catena.2020.105067