Loading…
iTabNet: an improved neural network for tabular data and its application to predict socioeconomic and environmental attributes
There is a growing application of machine learning methods to predict socioeconomic and environmental attributes in computational social science, where big data are usually presented in tabular format. However, it is still a challenge to develop novel deep learning models to deal with tabular data,...
Saved in:
Published in: | Neural computing & applications 2023-05, Vol.35 (15), p.11389-11402 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | There is a growing application of machine learning methods to predict socioeconomic and environmental attributes in computational social science, where big data are usually presented in tabular format. However, it is still a challenge to develop novel deep learning models to deal with tabular data, fill missing value, improve prediction accuracy, and enhance interpretability. In this study, we for the first time apply a tabular deep learning methodology (TabNet) to predict socioeconomic and environmental attributes (number of population and companies, volume of consumption, poker players’ behaviors, forest cover, etc.). Furthermore, we develop a new network architecture, referred to as improved TabNet (iTabNet), that can simultaneously learn local and global features in the tabular data to improve prediction accuracy. We also introduce a difference loss to constrain the feature selection process in iTabNet so that the model can use different features at different steps to enhance interpretability. To deal with missing values, we introduce a fusion strategy based on data mean and Auto-Encoder network to efficiently complete a more reasonable value filling. Experimental results demonstrate that the proposed iTabNet achieves competitive performances in the application to predict socioeconomic and environmental attributes based on tabular data, iTabNet using the proposed fusion strategy significantly outperforms other machine learning models when tabular data have missing values. |
---|---|
ISSN: | 0941-0643 1433-3058 |
DOI: | 10.1007/s00521-023-08304-7 |