
Target inductive methods for zero-shot regression


Bibliographic Details
Published in:Information sciences 2022-06, Vol.599, p.44-63
Main Authors: Fdez-Díaz, Miriam; Quevedo, José Ramón; Montañés, Elena
Format: Article
Language:English
Description
Summary:
• Provide general-purpose zero-shot methods for regression.
• Strategies for exploiting side information in the zero-shot regression task.
• Application to a real problem about air pollution prediction.

This research arises from the need to predict the amount of air pollutants at meteorological stations. Air pollution depends on the location of each station (weather conditions and activities in the surroundings). Frequently, this surrounding information is not considered in the learning process. It is known beforehand, in the absence of unobserved weather conditions, and remains constant for a given station. Treating the surrounding information as side information facilitates generalization to predicting pollutants at new stations, leading to a zero-shot regression scenario. Available zero-shot methods typically lean towards classification and are not easily extensible to regression. This paper proposes two zero-shot methods for regression. The first is a similarity-based approach that learns models from features and aggregates them using side information. However, potential knowledge in the feature models may be lost in the aggregation. The second method overcomes this drawback by replacing the aggregation procedure with learning the correspondence between side information and feature-induced models. Both proposals are compared with a baseline procedure on artificial datasets, the Communities and Crime datasets from the UCI repository, and the pollutant data. Both approaches outperform the baseline, and the parameter learning approach proves superior to the similarity-based method.
ISSN:0020-0255
1872-6291
DOI:10.1016/j.ins.2022.03.075
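
Illustrative sketch (not taken from the article): the summary describes two ideas, aggregating per-station models by side-information similarity, and learning a mapping from side information to model parameters. The summary does not state which models or similarity measure the authors use, so the Python code below assumes linear least-squares models, an exponential kernel on Euclidean distance, and a linear map from side information to weights. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns the weight vector."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def predict_linear(w, X):
    """Apply a weight vector (last entry is the intercept) to features X."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

def similarity_predict(models, side_train, side_new, X_new):
    """Aggregate per-target predictions, weighted by side-information similarity."""
    dist = np.linalg.norm(side_train - side_new, axis=1)
    weights = np.exp(-dist)            # assumed similarity kernel
    weights /= weights.sum()
    preds = np.stack([predict_linear(w, X_new) for w in models])  # (targets, samples)
    return weights @ preds

def parameter_predict(models, side_train, side_new, X_new):
    """Learn a linear map from side information to model weights, then apply it."""
    W = np.stack(models)                                   # (targets, features + 1)
    Sb = np.hstack([side_train, np.ones((side_train.shape[0], 1))])
    M, *_ = np.linalg.lstsq(Sb, W, rcond=None)             # assumed linear map
    w_new = np.append(side_new, 1.0) @ M
    return predict_linear(w_new, X_new)

# Synthetic, noise-free data: each target's true weights depend linearly
# on its side information (an assumption made only for this demonstration).
rng = np.random.default_rng(0)
k, d, T, n = 2, 3, 6, 50
A = rng.normal(size=(k, d + 1))
b = rng.normal(size=d + 1)

def true_weights(s):
    return s @ A + b

side_train = rng.normal(size=(T, k))
models = []
for s in side_train:
    X = rng.normal(size=(n, d))
    y = predict_linear(true_weights(s), X)
    models.append(fit_linear(X, y))

# A new, unobserved target: only its side information is known.
side_new = rng.normal(size=k)
X_new = rng.normal(size=(5, d))
y_true = predict_linear(true_weights(side_new), X_new)

y_sim = similarity_predict(models, side_train, side_new, X_new)
y_par = parameter_predict(models, side_train, side_new, X_new)
print("similarity-based error:", np.abs(y_sim - y_true).mean())
print("parameter-learning error:", np.abs(y_par - y_true).mean())
```

In this toy setting the parameter-learning variant can recover the new target's weights because the synthetic weights were generated as a linear function of the side information; that is a property of the demonstration data, not a claim about the article's experiments.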