Shapley Values for Explaining the Black Box Nature of Machine Learning Model Clustering
Published in: | Procedia computer science 2023, Vol.220, p.806-811 |
---|---|
Format: | Article |
Language: | English |
Summary: | Machine learning (ML) models are becoming increasingly complex. A sophisticated model (for example, boosting with XGBoost or deep learning) generally yields more accurate predictions than a simple model (for example, linear regression or a decision tree). There is therefore a trade-off between the performance of a model and its interpretability: what a model gains in performance, it loses in interpretability (and vice versa), where interpretability is the ability of a human to understand the reasons for a model's decision. Explaining the predictions made by machine learning models amounts to computing and interpreting the importance of features. To this end, game theory has recently gained attention as a way to better understand the similarity between group members. In this paper, we use SHAP (SHapley Additive exPlanations), a method based on cooperative game theory, to analyze and evaluate the properties of each group. More importantly, we rely on k-means, PCA, and a LightGBM classifier to improve data preparation before grouping the features into multiple clusters. The simulation results show the importance of Shapley values in creating an accurate and meaningful representation of each cluster. |
---|---|
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2023.03.107 |
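
The Summary describes SHAP as a method based on cooperative game theory. As a rough illustration of the underlying idea only, the sketch below computes exact Shapley values for a tiny cooperative game in plain Python. It is not the paper's code and does not use the SHAP library, which relies on faster approximations or model-specific algorithms. The function name `shapley_values`, the feature names `a`, `b`, `c`, and the payoff function `toy_value` are all hypothetical, chosen for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a small cooperative game.

    players: list of hashable player ids (here: hypothetical feature names).
    value:   function mapping a frozenset of players to a payoff.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when it joins coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Hypothetical payoff: 'a' adds 10, 'b' adds 20, together they add 6 more.

    'c' never changes the payoff, so it should receive a Shapley value of 0.
    """
    v = 0.0
    if "a" in coalition:
        v += 10.0
    if "b" in coalition:
        v += 20.0
    if "a" in coalition and "b" in coalition:
        v += 6.0
    return v


players = ["a", "b", "c"]
phi = shapley_values(players, toy_value)
for name in players:
    print(name, round(phi[name], 6))
```

In this toy game the interaction bonus is split evenly between `a` and `b`, and the attributions sum to the payoff of the full coalition. This additivity is the property that SHAP exploits when it attributes a model's prediction to individual features. Exact enumeration scales exponentially with the number of players, so it is practical only for very small examples like this one.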