A Comparative Study of Machine Learning and Spatial Interpolation Methods for Predicting House Prices

Bibliographic Details
Published in:	Sustainability 2022-08, Vol.14 (15), p.9056
Main Authors:	Kim, Jeonghyeon; Lee, Youngho; Lee, Myeong-Hun; Hong, Seong-Yun
Format:	Article
Language:	English
Subjects:	Accuracy; Comparative analysis; Comparative studies; Datasets; Environmental science; Housing prices; Integrated approach; Learning algorithms; Machine learning; Neural networks; Real estate; Spatial data; Sustainability; Variables; Weighting

Description
Summary:	As the volume of spatial data has rapidly increased over the last several decades, there is a growing concern about missing and incomplete observations that may result in biased conclusions. Several recent studies have reported that machine learning techniques can address this limitation in emerging data sets more efficiently than conventional interpolation approaches, such as inverse distance weighting and kriging. However, most existing studies focus on data from environmental sciences, so further evaluation is required to assess the strengths and limitations of these techniques for socioeconomic data, such as house price data. In this study, we conducted a comparative analysis of four commonly used methods: neural networks, random forests, inverse distance weighting, and kriging. We applied these methods to the real estate transaction data of Seoul, South Korea, and demonstrated how the values of houses with no recorded transactions could be predicted. Our empirical analysis suggested that the neural networks and random forests can provide more accurate estimates than their interpolation counterparts. Of the two machine learning techniques, the results from a random forest model were slightly better than those from a neural network model. However, the neural network appeared to be more sensitive to the amount of training data, implying that it has the potential to outperform the other methods when there are sufficient data available for training.
ISSN:	2071-1050
DOI:	10.3390/su14159056
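
Inverse distance weighting, one of the two interpolation baselines named in the summary, estimates a value at an unobserved location as a weighted mean of nearby observations, with weights that shrink as distance grows. The following is a minimal sketch of that general idea, not the authors' implementation: the function name `idw_predict`, the `power` parameter, and the toy coordinates and prices are all hypothetical and do not come from the article or its Seoul data set.

```python
import math


def idw_predict(known_points, known_values, query, power=2.0):
    """Estimate the value at `query` as a distance-weighted mean of known values.

    Each weight is 1 / distance**power, so closer observations count more.
    All names and defaults here are illustrative assumptions.
    """
    weighted = []
    for (x, y), value in zip(known_points, known_values):
        dist = math.hypot(x - query[0], y - query[1])
        if dist == 0.0:
            # The query coincides with an observation: return it directly
            # to avoid dividing by zero.
            return value
        weighted.append((1.0 / dist ** power, value))
    total_weight = sum(w for w, _ in weighted)
    return sum(w * v for w, v in weighted) / total_weight


# Hypothetical toy data: three observed locations and their prices.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
prices = [100.0, 200.0, 300.0]

# (1.0, 1.0) is equidistant from all three observations.
estimate = idw_predict(points, prices, (1.0, 1.0))
print(estimate)
```

Because the query point in this toy case is equidistant from all three observations, the weights are equal and the estimate reduces to the plain mean of the three prices. Unlike the machine learning methods compared in the study, this sketch uses only location and cannot draw on other attributes of a house.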