Explaining heatwaves with machine learning
Published in: | Quarterly journal of the Royal Meteorological Society 2024-04, Vol.150 (760), p.1207-1221 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Heatwaves are known to arise from the interplay between large‐scale climate variability, synoptic weather patterns, and regional to local‐scale surface processes. Though recent research has made important progress for each individual contributing factor, ways to properly incorporate multiple or all of them in a unified analysis are still lacking. In this study, we consider a wide range of possible predictor variables from the ERA5 Reanalysis version 5 and ask how much information on heatwave occurrence in Europe can be learned from each of them. To simplify the problem, we first adapt the recently developed logistic principal component analysis to the task of compressing large binary heatwave fields to a small number of interpretable principal components. The relationships between heatwaves and various climate variables can then be learned by a neural network. Starting from the simple notion that the importance of a variable is given by its impact on the performance of our statistical model, we arrive naturally at the definition of Shapley values. Classic results of game theory show that this is the only fair way of distributing the overall success of a model among its inputs. We find a nonlinear model that explains 70% of reduced heatwave variability. The biggest individual contribution (27% of the 70%) comes from upper level geopotential; top‐level soil moisture is in second place (15%). Beyond this decomposition, Shapley interaction values enable us to quantify overlapping information and positive synergies between all pairs of predictors.
An explanation of heatwaves is constructed in three steps: (1) compressing the binary heatwave field with a newly adapted logistic principal component analysis; (2) modeling the relationship to geopotential, soil moisture, and other relevant climate variables with a neural network; and (3) decomposing the success of the model into contributions from each variable via Shapley values. We find that soil moisture and 500 hPa geopotential are relevant across Europe, the relative contributions varying from region to region. |
---|---|
ISSN: | 0035-9009 1477-870X |
DOI: | 10.1002/qj.4642 |
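
The summary describes Shapley values as the fair way to distribute a model's overall success among its inputs. As a minimal sketch of that idea (not the authors' implementation), the Python below computes exact Shapley values for a two-predictor toy game. The predictor names `z500` and `soil` and all skill scores in `toy_skill` are hypothetical illustration values, not results from the article.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a set function `value` defined on subsets of `players`."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Probability that p joins after exactly these k other players
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain in model skill when p is added to subset s
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


# Hypothetical skill (e.g. fraction of variability explained) of a model
# trained on each subset of predictors. These numbers are made up.
toy_skill = {
    frozenset(): 0.0,
    frozenset({"z500"}): 0.4,
    frozenset({"soil"}): 0.3,
    frozenset({"z500", "soil"}): 0.6,
}

phi = shapley_values(["z500", "soil"], toy_skill.__getitem__)

# For two players, the pairwise Shapley interaction index reduces to this
# difference; a negative value indicates overlapping information.
interaction = (
    toy_skill[frozenset({"z500", "soil"})]
    - toy_skill[frozenset({"z500"})]
    - toy_skill[frozenset({"soil"})]
    + toy_skill[frozenset()]
)

print(phi, interaction)
```

In this toy game the two contributions add up to the skill of the full model, which is the efficiency property behind statements such as "27% of the 70%". The exact enumeration visits every subset of predictors, so it is only practical for a small number of inputs.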