Visual interpretation of deep learning model in ECG classification: A comprehensive evaluation of feature attribution methods
Published in: | Computers in Biology and Medicine, 2024-11, Vol. 182, p. 109088, Article 109088 |
---|---|
Main Authors: | , , , , , , , |
Format: | Article |
Language: | English |
Summary: | Feature attribution methods can visually highlight specific input regions containing influential aspects affecting a deep learning model's prediction. Recently, the use of feature attribution methods in electrocardiogram (ECG) classification has been sharply increasing, as they assist clinicians in understanding the model's decision-making process and assessing the model's reliability. However, a careful study to identify suitable methods for ECG datasets has been lacking, leading researchers to select methods without a thorough understanding of their appropriateness. In this work, we conduct a large-scale assessment by considering eleven popular feature attribution methods across five large ECG datasets using a model based on the ResNet-18 architecture. Our experiments include both automatic evaluations and human evaluations. Annotated datasets were utilized for automatic evaluations and three cardiac experts were involved for human evaluations. We found that Guided Grad-CAM, particularly when its absolute values are utilized, achieves the best performance. When Guided Grad-CAM was utilized as the feature attribution method, cardiac experts confirmed that it can identify diagnostically relevant electrophysiological characteristics, although its effectiveness varied across the 17 different diagnoses that we have investigated.
• A large-scale assessment of feature attribution methods is provided.
• Eleven feature attribution methods are considered over five large ECG datasets.
• Both automatic and human evaluations are performed.
• In our experiments, Guided Grad-CAM exhibited outstanding performance. |
---|---|
ISSN: | 0010-4825 1879-0534 |
DOI: | 10.1016/j.compbiomed.2024.109088 |