Towards Visually Explaining Video Understanding Networks with Perturbation
"Making black box models explainable" is a vital problem that accompanies the development of deep learning networks. For networks taking visual information as input, one basic but challenging explanation method is to identify and visualize the input pixels/regions that dominate the network's prediction…
Published in: | arXiv.org 2020-11 |
---|---|
Main Authors: | Li, Zhenqiang; Wang, Weimin; Li, Zuoyue; Huang, Yifei; Sato, Yoichi |
Format: | Article |
Language: | English |
Subjects: | Artificial neural networks; Mathematical analysis; Neural networks; Perturbation |
Online Access: | Get full text |
container_title | arXiv.org |
---|---|
creator | Li, Zhenqiang; Wang, Weimin; Li, Zuoyue; Huang, Yifei; Sato, Yoichi |
description | "Making black box models explainable" is a vital problem that accompanies the development of deep learning networks. For networks taking visual information as input, one basic but challenging explanation method is to identify and visualize the input pixels/regions that dominate the network's prediction. However, most existing works focus on explaining networks that take a single image as input and do not consider the temporal relationship that exists in videos. Providing an easy-to-use visual explanation method applicable to the diverse structures of video understanding networks remains an open challenge. In this paper, we investigate a generic perturbation-based method for visually explaining video understanding networks. In addition, we propose a novel loss function that enhances the method by constraining the smoothness of its results in both the spatial and temporal dimensions. The method enables comparison of explanation results across different network structures and avoids generating pathological adversarial explanations for video inputs. Experimental comparisons verified the effectiveness of our method. |
format | article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2020-11 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2397929262 |
source | Publicly Available Content (ProQuest) |
subjects | Artificial neural networks; Mathematical analysis; Neural networks; Perturbation |
title | Towards Visually Explaining Video Understanding Networks with Perturbation |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T23%3A43%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Towards%20Visually%20Explaining%20Video%20Understanding%20Networks%20with%20Perturbation&rft.jtitle=arXiv.org&rft.au=Li,%20Zhenqiang&rft.date=2020-11-09&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2397929262%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_23979292623%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2397929262&rft_id=info:pmid/&rfr_iscdi=true |
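The abstract in the description field above outlines a perturbation-based explanation: optimize a spatio-temporal mask over the input video so that keeping only the masked regions preserves the network's prediction, while a loss term keeps the mask small and smooth in both space and time. The sketch below illustrates that general idea; it is not the authors' released code, and the PyTorch model interface, the blur baseline, the loss weights, and the function name `explain_video` are all assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def explain_video(model, video, target_class, steps=300, lr=0.05,
                  lam_area=1.0, lam_spatial=0.1, lam_temporal=0.1):
    """Perturbation-based saliency for a video classifier (illustrative).

    video: tensor of shape (1, C, T, H, W); model maps it to class logits.
    Optimizes a per-frame, per-pixel mask in [0, 1] so that keeping only
    the masked regions preserves the target-class score, while the loss
    keeps the mask small and smooth in space and time.
    """
    model.eval()
    # A heavily blurred copy of the video serves as the perturbed baseline.
    blurred = F.avg_pool3d(video, kernel_size=(1, 11, 11),
                           stride=1, padding=(0, 5, 5))
    mask = torch.full(video.shape[-3:], 0.5,                 # shape (T, H, W)
                      device=video.device, requires_grad=True)
    opt = torch.optim.Adam([mask], lr=lr)

    for _ in range(steps):
        m = mask.clamp(0, 1)                       # keep mask in [0, 1]
        composite = m * video + (1 - m) * blurred  # reveal only masked regions
        score = model(composite).softmax(dim=1)[0, target_class]

        area = m.mean()                                    # prefer small masks
        tv_s = (m[:, 1:] - m[:, :-1]).abs().mean() \
             + (m[:, :, 1:] - m[:, :, :-1]).abs().mean()   # spatial smoothness
        tv_t = (m[1:] - m[:-1]).abs().mean()               # temporal smoothness

        loss = -score + lam_area * area + lam_spatial * tv_s + lam_temporal * tv_t
        opt.zero_grad()
        loss.backward()
        opt.step()

    return mask.detach().clamp(0, 1)
```

The returned mask can be overlaid on the frames as a heat map. Raising the temporal-smoothness weight penalizes frame-to-frame flicker in the mask, which corresponds to the kind of pathological adversarial explanation for video inputs that the abstract says the proposed loss is designed to avoid.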