
Demonstrating the Risk of Imbalanced Datasets in Chest X-Ray Image-Based Diagnostics by Prototypical Relevance Propagation

Bibliographic Details
Main Authors: Gautam, Srishti, Hohne, Marina M.-C., Hansen, Stine, Jenssen, Robert, Kampffmeyer, Michael
Format: Conference Proceeding
Language: English
container_end_page 5
container_start_page 1
creator Gautam, Srishti
Hohne, Marina M.-C.
Hansen, Stine
Jenssen, Robert
Kampffmeyer, Michael
description The recent trend of integrating multi-source Chest X-Ray datasets to improve automated diagnostics raises concerns that models learn to exploit source-specific correlations to improve performance by recognizing the source domain of an image rather than the medical pathology. We hypothesize that this effect is enforced by and leverages label-imbalance across the source domains, i.e., the prevalence of a disease corresponding to a source. Therefore, in this work, we perform a thorough study of the effect of label-imbalance in multi-source training for the task of pneumonia detection on the widely used ChestX-ray14 and CheXpert datasets. The results highlight and stress the importance of using more faithful and transparent self-explaining models for automated diagnosis, thus enabling the inherent detection of spurious learning. They further illustrate that this undesirable effect of learning spurious correlations can be reduced considerably when ensuring label-balanced source domain datasets.
doi_str_mv 10.1109/ISBI52829.2022.9761651
format conference_proceeding
fullrecord <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_9761651</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9761651</ieee_id><sourcerecordid>9761651</sourcerecordid><originalsourceid>FETCH-LOGICAL-i203t-b4192fe2d35fb672ec6c7a04c84f3f503b86f7d7adde227a918cf11d438450a73</originalsourceid><addsrcrecordid>eNotkN1Kw0AQhVdBsNY-gSD7Aqm7s5tscmlbfwIFpSp4VybJbLqaJiW7CPHpTbFzM3DOxxnOMHYrxVxKkd3lb4s8hhSyOQiAeWYSmcTyjF3JJIk1ZKDgnE1kpuMo1TFcspn3X2Ico7USesJ-V7TvWh96DK6tedgR3zj_zTvL832BDbYlVXyFAT0Fz13LlzvygX9GGxxGBGuKFqM3Mg7rtvPBlZ4XA3_tu9CF4eBKbPiGGvo5Rh3lA9bjsa69ZhcWG0-z056yj8eH9-VztH55ypf368iBUCEqtMzAElQqtkVigMqkNCh0mWqrbCxUkSbWVAarigAMZjItrZSVVmNjgUZN2c1_riOi7aF3e-yH7elV6g_itF-n</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Demonstrating the Risk of Imbalanced Datasets in Chest X-Ray Image-Based Diagnostics by Prototypical Relevance Propagation</title><source>IEEE Xplore All Conference Series</source><creator>Gautam, Srishti ; Hohne, Marina M.-C. ; Hansen, Stine ; Jenssen, Robert ; Kampffmeyer, Michael</creator><creatorcontrib>Gautam, Srishti ; Hohne, Marina M.-C. ; Hansen, Stine ; Jenssen, Robert ; Kampffmeyer, Michael</creatorcontrib><description>The recent trend of integrating multi-source Chest X-Ray datasets to improve automated diagnostics raises concerns that models learn to exploit source-specific correlations to improve performance by recognizing the source domain of an image rather than the medical pathology. We hypothesize that this effect is enforced by and leverages label-imbalance across the source domains, i.e, prevalence of a disease corresponding to a source. Therefore, in this work, we perform a thorough study of the effect of label-imbalance in multi-source training for the task of pneumonia detection on the widely used ChestX-ray14 and CheXpert datasets. 
The results highlight and stress the importance of using more faithful and transparent self-explaining models for automated diagnosis, thus enabling the inherent detection of spurious learning. They further illustrate that this undesirable effect of learning spurious correlations can be reduced considerably when ensuring label-balanced source domain datasets.</description><identifier>EISSN: 1945-8452</identifier><identifier>EISBN: 1665429232</identifier><identifier>EISBN: 9781665429238</identifier><identifier>DOI: 10.1109/ISBI52829.2022.9761651</identifier><language>eng</language><publisher>IEEE</publisher><subject>Artifact Detection ; Biological system modeling ; Chest X-Ray ; Correlation ; Detectors ; Explainable AI ; Pathology ; Pulmonary diseases ; Real-time systems ; Self-Explaining Models ; Spurious Learning ; Training</subject><ispartof>2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI), 2022, p.1-5</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9761651$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,777,781,786,787,23911,23912,25121,27906,54536,54913</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9761651$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Gautam, Srishti</creatorcontrib><creatorcontrib>Hohne, Marina M.-C.</creatorcontrib><creatorcontrib>Hansen, Stine</creatorcontrib><creatorcontrib>Jenssen, Robert</creatorcontrib><creatorcontrib>Kampffmeyer, Michael</creatorcontrib><title>Demonstrating the Risk of Imbalanced Datasets in Chest X-Ray Image-Based Diagnostics by Prototypical Relevance Propagation</title><title>2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI)</title><addtitle>ISBI</addtitle><description>The 
recent trend of integrating multi-source Chest X-Ray datasets to improve automated diagnostics raises concerns that models learn to exploit source-specific correlations to improve performance by recognizing the source domain of an image rather than the medical pathology. We hypothesize that this effect is enforced by and leverages label-imbalance across the source domains, i.e, prevalence of a disease corresponding to a source. Therefore, in this work, we perform a thorough study of the effect of label-imbalance in multi-source training for the task of pneumonia detection on the widely used ChestX-ray14 and CheXpert datasets. The results highlight and stress the importance of using more faithful and transparent self-explaining models for automated diagnosis, thus enabling the inherent detection of spurious learning. They further illustrate that this undesirable effect of learning spurious correlations can be reduced considerably when ensuring label-balanced source domain datasets.</description><subject>Artifact Detection</subject><subject>Biological system modeling</subject><subject>Chest X-Ray</subject><subject>Correlation</subject><subject>Detectors</subject><subject>Explainable AI</subject><subject>Pathology</subject><subject>Pulmonary diseases</subject><subject>Real-time systems</subject><subject>Self-Explaining Models</subject><subject>Spurious 
Learning</subject><subject>Training</subject><issn>1945-8452</issn><isbn>1665429232</isbn><isbn>9781665429238</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2022</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotkN1Kw0AQhVdBsNY-gSD7Aqm7s5tscmlbfwIFpSp4VybJbLqaJiW7CPHpTbFzM3DOxxnOMHYrxVxKkd3lb4s8hhSyOQiAeWYSmcTyjF3JJIk1ZKDgnE1kpuMo1TFcspn3X2Ico7USesJ-V7TvWh96DK6tedgR3zj_zTvL832BDbYlVXyFAT0Fz13LlzvygX9GGxxGBGuKFqM3Mg7rtvPBlZ4XA3_tu9CF4eBKbPiGGvo5Rh3lA9bjsa69ZhcWG0-z056yj8eH9-VztH55ypf368iBUCEqtMzAElQqtkVigMqkNCh0mWqrbCxUkSbWVAarigAMZjItrZSVVmNjgUZN2c1_riOi7aF3e-yH7elV6g_itF-n</recordid><startdate>20220328</startdate><enddate>20220328</enddate><creator>Gautam, Srishti</creator><creator>Hohne, Marina M.-C.</creator><creator>Hansen, Stine</creator><creator>Jenssen, Robert</creator><creator>Kampffmeyer, Michael</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>20220328</creationdate><title>Demonstrating the Risk of Imbalanced Datasets in Chest X-Ray Image-Based Diagnostics by Prototypical Relevance Propagation</title><author>Gautam, Srishti ; Hohne, Marina M.-C. 
; Hansen, Stine ; Jenssen, Robert ; Kampffmeyer, Michael</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i203t-b4192fe2d35fb672ec6c7a04c84f3f503b86f7d7adde227a918cf11d438450a73</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Artifact Detection</topic><topic>Biological system modeling</topic><topic>Chest X-Ray</topic><topic>Correlation</topic><topic>Detectors</topic><topic>Explainable AI</topic><topic>Pathology</topic><topic>Pulmonary diseases</topic><topic>Real-time systems</topic><topic>Self-Explaining Models</topic><topic>Spurious Learning</topic><topic>Training</topic><toplevel>online_resources</toplevel><creatorcontrib>Gautam, Srishti</creatorcontrib><creatorcontrib>Hohne, Marina M.-C.</creatorcontrib><creatorcontrib>Hansen, Stine</creatorcontrib><creatorcontrib>Jenssen, Robert</creatorcontrib><creatorcontrib>Kampffmeyer, Michael</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Gautam, Srishti</au><au>Hohne, Marina M.-C.</au><au>Hansen, Stine</au><au>Jenssen, Robert</au><au>Kampffmeyer, Michael</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Demonstrating the Risk of Imbalanced Datasets in Chest X-Ray Image-Based Diagnostics by Prototypical Relevance Propagation</atitle><btitle>2022 IEEE 19th International Symposium on Biomedical Imaging 
(ISBI)</btitle><stitle>ISBI</stitle><date>2022-03-28</date><risdate>2022</risdate><spage>1</spage><epage>5</epage><pages>1-5</pages><eissn>1945-8452</eissn><eisbn>1665429232</eisbn><eisbn>9781665429238</eisbn><abstract>The recent trend of integrating multi-source Chest X-Ray datasets to improve automated diagnostics raises concerns that models learn to exploit source-specific correlations to improve performance by recognizing the source domain of an image rather than the medical pathology. We hypothesize that this effect is enforced by and leverages label-imbalance across the source domains, i.e, prevalence of a disease corresponding to a source. Therefore, in this work, we perform a thorough study of the effect of label-imbalance in multi-source training for the task of pneumonia detection on the widely used ChestX-ray14 and CheXpert datasets. The results highlight and stress the importance of using more faithful and transparent self-explaining models for automated diagnosis, thus enabling the inherent detection of spurious learning. They further illustrate that this undesirable effect of learning spurious correlations can be reduced considerably when ensuring label-balanced source domain datasets.</abstract><pub>IEEE</pub><doi>10.1109/ISBI52829.2022.9761651</doi><tpages>5</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier EISSN: 1945-8452
ispartof 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI), 2022, p.1-5
issn 1945-8452
language eng
recordid cdi_ieee_primary_9761651
source IEEE Xplore All Conference Series
subjects Artifact Detection
Biological system modeling
Chest X-Ray
Correlation
Detectors
Explainable AI
Pathology
Pulmonary diseases
Real-time systems
Self-Explaining Models
Spurious Learning
Training
title Demonstrating the Risk of Imbalanced Datasets in Chest X-Ray Image-Based Diagnostics by Prototypical Relevance Propagation
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T19%3A12%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Demonstrating%20the%20Risk%20of%20Imbalanced%20Datasets%20in%20Chest%20X-Ray%20Image-Based%20Diagnostics%20by%20Prototypical%20Relevance%20Propagation&rft.btitle=2022%20IEEE%2019th%20International%20Symposium%20on%20Biomedical%20Imaging%20(ISBI)&rft.au=Gautam,%20Srishti&rft.date=2022-03-28&rft.spage=1&rft.epage=5&rft.pages=1-5&rft.eissn=1945-8452&rft_id=info:doi/10.1109/ISBI52829.2022.9761651&rft.eisbn=1665429232&rft.eisbn_list=9781665429238&rft_dat=%3Cieee_CHZPO%3E9761651%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i203t-b4192fe2d35fb672ec6c7a04c84f3f503b86f7d7adde227a918cf11d438450a73%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=9761651&rfr_iscdi=true