Loading…

iTabNet: an improved neural network for tabular data and its application to predict socioeconomic and environmental attributes

There is a growing application of machine learning methods to predict socioeconomic and environmental attributes in computational social science, where big data are usually presented in tabular format. However, it is still a challenge to develop novel deep learning models to deal with tabular data,...

Full description

Saved in:
Bibliographic Details
Published in:Neural computing & applications 2023-05, Vol.35 (15), p.11389-11402
Main Authors: Liu, Junmin, Tian, Tian, Liu, Yunxia, Hu, Sufeng, Li, Mengyao
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:There is a growing application of machine learning methods to predict socioeconomic and environmental attributes in computational social science, where big data are usually presented in tabular format. However, it is still a challenge to develop novel deep learning models to deal with tabular data, fill missing value, improve prediction accuracy, and enhance interpretability. In this study, we for the first time apply a tabular deep learning methodology (TabNet) to predict socioeconomic and environmental attributes (number of population and companies, volume of consumption, poker players’ behaviors, forest cover, etc.). Furthermore, we develop a new network architecture, referred to as improved TabNet (iTabNet), that can simultaneously learn local and global features in the tabular data to improve prediction accuracy. We also introduce a difference loss to constrain the feature selection process in iTabNet so that the model can use different features at different steps to enhance interpretability. To deal with missing values, we introduce a fusion strategy based on data mean and Auto-Encoder network to efficiently complete a more reasonable value filling. Experimental results demonstrate that the proposed iTabNet achieves competitive performances in the application to predict socioeconomic and environmental attributes based on tabular data, iTabNet using the proposed fusion strategy significantly outperforms other machine learning models when tabular data have missing values.
ISSN:0941-0643
1433-3058
DOI:10.1007/s00521-023-08304-7