Shapley Values for Explaining the Black Box Nature of Machine Learning Model Clustering

Bibliographic Details
Published in:	Procedia Computer Science, 2023, Vol. 220, p. 806-811
Main Authors:	Louhichi, Mouad; Nesmaoui, Redwane; Mbarek, Marwan; Lazaar, Mohamed
Format:	Article
Language:	English
Subjects:	Clustering; Dimensionality reduction; Explainability; Game theory; Machine learning; Shapley value; Visualization

Description
Summary:	Machine learning (ML) models are becoming increasingly complex. A sophisticated model (such as XGBoost or deep learning) generally yields more accurate predictions than a simple one (such as linear regression or a decision tree). There is therefore a trade-off between a model's performance and its interpretability: what a model gains in performance, it loses in interpretability (and vice versa), where interpretability is the ability of a human to understand the reasons for a model's decision. Explaining the predictions made by machine learning models involves computing and interpreting the importance of features. To this end, game theory has recently gained attention as a way to better understand the similarity between group members. In this paper, we use SHAP (SHapley Additive exPlanations), a method based on cooperative game theory, to analyze and evaluate the properties of each group. More importantly, we rely on k-means, PCA, and a LightGBM classifier to improve data preparation before grouping the features into multiple clusters. The simulation results show the importance of Shapley values in creating an accurate and meaningful representation of each cluster.
ISSN:	1877-0509
DOI:	10.1016/j.procs.2023.03.107