Road and travel time cross-validation for urban modelling

Bibliographic Details
Published in: International journal of geographical information science : IJGIS, 2020-01, Vol. 34 (1), p. 98-118
Main Authors: Crosby, Henry; Damoulas, Theodoros; Jarvis, Stephen A.
Format: Article
Language: English
Description
Summary: The physical and social processes in urban systems are inherently spatial, and hence the data describing them contain spatial autocorrelation (SAC, a proximity-based interdependency in a variable) that needs to be accounted for. Standard k-fold cross-validation (KCV) techniques, which attempt to measure the generalisation performance of machine learning and statistical algorithms, are inappropriate in this setting because of their inherent i.i.d. assumption, which is violated by spatial dependency. As such, more appropriate validation methods have been considered, notably blocking and spatial k-fold cross-validation (SKCV). However, the physical barriers and complex network structures that make up a city's landscape mean that these methods are also inappropriate, largely because the travel patterns (and hence the SAC) in most urban spaces are rarely Euclidean in nature. To overcome this problem, we propose a new road distance and travel time k-fold cross-validation method, RT-KCV. We show that it outperforms the prior art by providing better estimates of the true generalisation performance on unseen data.
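The idea in the summary can be illustrated with a minimal sketch. It is not the authors' RT-KCV implementation, whose details are not given in this record. It only shows a generic buffered k-fold split in which training points too close to any test point are dropped, with "close" defined by whatever pairwise distance matrix is supplied. The function name `buffered_kfold`, the `radius` parameter, and the toy "river" data are all hypothetical and chosen purely for illustration.

```python
import numpy as np


def buffered_kfold(dist, k, radius, seed=0):
    """Yield (train_idx, test_idx) pairs for a buffered k-fold split.

    dist   : (n, n) array of pairwise distances. Hypothetical input: it may
             hold Euclidean distance, road distance, or travel time.
    k      : number of folds.
    radius : training points closer than this to any test point are dropped.
    """
    n = dist.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    for test_idx in folds:
        in_test = np.zeros(n, dtype=bool)
        in_test[test_idx] = True
        # Distance from every point to its nearest test point.
        nearest_test = dist[:, test_idx].min(axis=1)
        keep = ~in_test & (nearest_test >= radius)
        yield np.flatnonzero(keep), test_idx


# Toy data (an assumption, not from the article): four sites on a line,
# with a barrier between sites 1 and 2 that adds a 10-unit detour by road.
coords = np.array([0.0, 1.0, 2.0, 3.0])
euclid = np.abs(coords[:, None] - coords[None, :])
road = euclid.copy()
left, right = [0, 1], [2, 3]
road[np.ix_(left, right)] += 10.0
road[np.ix_(right, left)] += 10.0

# One test site per fold (k = n), keyed by the test site's index.
splits_euclid = {int(te[0]): tr.tolist()
                 for tr, te in buffered_kfold(euclid, k=4, radius=1.5)}
splits_road = {int(te[0]): tr.tolist()
               for tr, te in buffered_kfold(road, k=4, radius=1.5)}

print("train set for test site 1, Euclidean:", splits_euclid[1])
print("train set for test site 1, road:     ", splits_road[1])
```

With the Euclidean matrix, site 2 is excluded from training when site 1 is held out, although the barrier makes it far away by road. Swapping in the road matrix changes which points count as neighbours, which is the kind of difference the summary attributes to non-Euclidean urban travel patterns.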
ISSN: 1365-8816, 1362-3087, 1365-8824
DOI: 10.1080/13658816.2019.1658876