
Database quality assessment for interactive learning: Application to occupancy estimation


Bibliographic Details
Published in: Energy and Buildings, 2020-02, Vol. 209, p. 109578, Article 109578
Main Authors: Amayri, Manar; Ploix, Stephane; Bouguila, Nizar; Wurtz, Frederic
Format: Article
Language: English
Description
Summary:
• A novel approach to assessing the quality of training data is proposed; in particular, two scores (Sscore and Qscore) are calculated from a new concept called spread rate.
• Spread rate is successfully applied to occupancy estimation via an interactive learning approach.
• Interactive learning is an extension of supervised learning that, in our case, estimates occupancy by collecting the required labels from the occupants themselves.
• The question of when to ask occupants is investigated on the basis of database quality.
• Spread rate is compared with the density of the neighborhood, which it replaces. It moves from a local criterion to a global one (instead of counting the records, it checks how records are globally distributed).

Data quality assessment is a key component of many real applications, since it can drive better modelling. In this work, a methodology to assess data quality (Qscore) is proposed and discussed. Qscore is validated via an interactive learning experiment related to occupancy estimation. Interactive learning has been shown to be crucial for considering and integrating occupant behavior in smart buildings. Indeed, valuable feedback and information can be collected from the occupants by involving them and by raising their awareness of energy management systems. Users should feel involved in order to keep developing highly energy-efficient buildings. To reach this goal, occupants should be aware of the building features so that they feel more in control. This paper proposes a framework for interacting with occupants to estimate building occupancy. The framework is based on an enhanced supervised learning approach that interacts with occupants, when necessary, to keep collecting training data. The training data consist of the measurements (i.e. features) collected from common sensors, for instance motion detection, power consumption, and CO2 concentration, and the label (i.e. number of occupants) provided by the occupants during interactions. The learning machine considered in our experiments is the multi-layer perceptron (MLP) regressor, although other approaches could easily be integrated within the proposed framework. To avoid unnecessary interactions with users, a new concept called spread rate is introduced; it measures the quality of the data in order to decide whether an interaction with the user is necessary. Extensive simulations have shown
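
As a rough illustration of the local criterion that the summary contrasts with spread rate, the sketch below decides whether to ask an occupant by counting the stored records near a new measurement (the density of the neighborhood). It is not the paper's spread rate, Sscore, or Qscore, whose definitions are not given in this record. All function names, the `radius` and `min_neighbors` parameters, and the `ask_occupant` callback are hypothetical, and features are assumed to be normalized to comparable scales.

```python
import math


def neighborhood_density(records, x, radius):
    """Count stored feature vectors within `radius` of x (Euclidean distance)."""
    return sum(1 for r in records if math.dist(r, x) <= radius)


def should_ask(records, x, radius=1.0, min_neighbors=3):
    """Ask for a label only when the neighborhood of x is sparsely populated.

    Assumption: this is the local, record-counting criterion, used here as a
    stand-in; the paper replaces it with the global spread rate.
    """
    return neighborhood_density(records, x, radius) < min_neighbors


def run_interaction_loop(stream, ask_occupant, radius=1.0, min_neighbors=3):
    """Collect (features, label) pairs, querying the occupant only when needed.

    `stream` yields normalized sensor feature tuples (for example motion,
    power, CO2). `ask_occupant` is a hypothetical callback that returns the
    number of occupants reported for the given measurement.
    """
    features, labels = [], []
    for x in stream:
        if should_ask(features, x, radius, min_neighbors):
            features.append(x)
            labels.append(ask_occupant(x))
    return features, labels
```

The collected pairs could then be used to retrain whichever regressor is chosen, such as an MLP. Counting neighbors only inspects the region around the new measurement, which is the limitation the summary says spread rate addresses by looking at how records are distributed globally.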
ISSN: 0378-7788; 1872-6178
DOI: 10.1016/j.enbuild.2019.109578