
Prediction and explanation of the formation of the Spanish day-ahead electricity price through machine learning regression


Bibliographic Details
Published in: Applied Energy, 2019-04, Vol. 239, p. 610-625
Main Authors: Díaz, Guzmán; Coto, José; Gómez-Aleixandre, Javier
Format: Article
Language: English
Description
Summary:
• We propose a regression-tree-based method for modeling electricity price formation.
• The explanatory variables are extracted from publicly accessible energy-related data.
• The energy-related data are free and published by the TSO in a graphical interface.
• The model shows good accuracy in predicting the price formation.
• It also allows for a non-linear analysis of the dependence of price on predictors.

Until recently, the detailed information on the power system state needed to estimate future spot prices by regression analysis was generally restricted to qualified parties. However, to ensure transparency in operation, the Spanish Transmission System Operator has launched an information website on which a sizable amount of real-time energy-related data can be consulted through a graphical interface. This gives non-qualified parties the opportunity to develop applications and algorithms that require price forecasts and, possibly, knowledge of how the price is determined. This paper uses data extracted from that interface with two aims: predicting the day-ahead price in a simple way, and exploring the influence that the underlying energy drivers have on it. For the prediction we specified a quantile regression model based on Gradient Boosted Regression Trees. It improves accuracy over multiple linear regression models at the cost of more complexity, yet it is simpler to specify and tune than other machine learning approaches. The calculated metrics show that our model produces remarkably low prediction errors when the median is used as the point prediction (RMSE = 2.78 €/MWh, MAE = 1.94 €/MWh, and MAPE = 0.059). The quantile regression model also inherently allows prediction intervals to be defined, which offer a different interpretation of accuracy. Our results show that, on average, 90% of the time the prediction error will not exceed 6.8 €/MWh.
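The accuracy measures named in the summary can be illustrated with a minimal sketch. This is not the authors' code: the function names are chosen here for illustration, the price values are invented, and MAPE is computed as a fraction (not a percentage) on the assumption that this matches the reported 0.059. The pinball loss is the standard objective for quantile regression; with q = 0.5 it targets the median.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, as a fraction; requires non-zero y_true."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def pinball_loss(y_true, y_pred, q):
    """Average quantile (pinball) loss for quantile level q in (0, 1)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        # Under-prediction is weighted by q, over-prediction by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)

def interval_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    hits = sum(1 for t, lo, hi in zip(y_true, lower, upper) if lo <= t <= hi)
    return hits / len(y_true)

# Hypothetical prices in EUR/MWh, invented for illustration only.
prices = [50.0, 40.0, 60.0, 55.0]
median_pred = [48.0, 42.0, 61.0, 55.0]   # stand-in for the 0.5-quantile forecast
lower_pred = [45.0, 41.0, 55.0, 50.0]    # stand-in for a lower quantile forecast
upper_pred = [52.0, 45.0, 62.0, 58.0]    # stand-in for an upper quantile forecast

print(rmse(prices, median_pred))
print(mae(prices, median_pred))
print(mape(prices, median_pred))
print(pinball_loss(prices, median_pred, 0.5))
print(interval_coverage(prices, lower_pred, upper_pred))
```

In this sketch a prediction interval is simply a pair of quantile forecasts, and its quality is judged by how often the realized price falls inside it rather than by a point error.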
We also implemented a partial dependence analysis on that model. This implementation, to our knowledge the first applied to the analysis of electricity price formation, proved highly useful in detecting strongly non-linear relationships.
ISSN: 0306-2619, 1872-9118
DOI: 10.1016/j.apenergy.2019.01.213