
Towards Visually Explaining Video Understanding Networks with Perturbation

"Making black box models explainable" is a vital problem that accompanies the development of deep learning networks. For networks taking visual information as input, one basic but challenging explanation method is to identify and visualize the input pixels/regions that dominate...

Bibliographic Details
Published in:	arXiv.org 2020-11
Main Authors:	Li, Zhenqiang; Wang, Weimin; Li, Zuoyue; Huang, Yifei; Sato, Yoichi
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Mathematical analysis; Neural networks; Perturbation

container_title	arXiv.org
creator	Li, Zhenqiang; Wang, Weimin; Li, Zuoyue; Huang, Yifei; Sato, Yoichi
description	"Making black box models explainable" is a vital problem that accompanies the development of deep learning networks. For networks taking visual information as input, one basic but challenging explanation method is to identify and visualize the input pixels/regions that dominate the network's prediction. However, most existing works focus on explaining networks that take a single image as input and do not consider the temporal relationships that exist in videos. Providing an easy-to-use visual explanation method that applies to the diverse structures of video understanding networks remains an open challenge. In this paper, we investigate a generic perturbation-based method for visually explaining video understanding networks. In addition, we propose a novel loss function that enhances the method by constraining the smoothness of its results in both the spatial and temporal dimensions. The method makes it possible to compare explanation results across different network structures and also avoids generating pathological adversarial explanations for video inputs. Comparative experiments verify the effectiveness of our method.
format	article
publisher	Ithaca: Cornell University Library, arXiv.org
date	2020-11-09
rights	2020. This work is published under http://creativecommons.org/licenses/by/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
oa	free_for_read
sourcetype	Aggregation Database
pqid	2397929262
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2020-11
issn	2331-8422
language	eng
recordid	cdi_proquest_journals_2397929262
source	Publicly Available Content (ProQuest)
subjects	Artificial neural networks; Mathematical analysis; Neural networks; Perturbation
title	Towards Visually Explaining Video Understanding Networks with Perturbation
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T23%3A43%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Towards%20Visually%20Explaining%20Video%20Understanding%20Networks%20with%20Perturbation&rft.jtitle=arXiv.org&rft.au=Li,%20Zhenqiang&rft.date=2020-11-09&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2397929262%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_23979292623%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2397929262&rft_id=info:pmid/&rfr_iscdi=true