
Inflated 3D ConvNet context analysis for violence detection

Bibliographic Details
Published in: Machine Vision and Applications, 2022, Vol. 33 (1), Article 15
Main Authors: Freire-Obregón, David; Barra, Paola; Castrillón-Santana, Modesto; De Marsico, Maria
Format: Article
Language: English
Subjects: Classifiers; Communications Engineering; Computer Science; Context; Datasets; Image Processing and Computer Vision; Networks; Pattern Recognition; Proposals; Robotics and Intelligent Systems; Special Issue on 25th ICPR - Computer Vision; Special Issue Paper; Training; Violence; Vision systems
Online Access: Get full text
Source: Springer Link
ISSN: 0932-8092
EISSN: 1432-1769
DOI: 10.1007/s00138-021-01264-9
Publisher: Springer Berlin Heidelberg
Rights: The Author(s) 2021; published under a Creative Commons Attribution (CC BY 4.0) license.
Description: According to the Wall Street Journal, one billion surveillance cameras will be deployed around the world by 2021. Such an amount of information can hardly be managed by humans. Using an Inflated 3D ConvNet as backbone, this paper introduces a novel automatic violence detection approach that outperforms existing state-of-the-art proposals. Most of those proposals include a pre-processing step that restricts attention to some regions of interest in the scene, i.e., those actually containing a human subject. In this regard, the paper also reports the results of an extensive analysis of whether and how context affects the performance of the adopted classifier. The experiments show that context-free footage causes a substantial deterioration of classifier performance (2% to 5%) on publicly available datasets. However, they also demonstrate that performance stabilizes in context-free settings, regardless of the level of context restriction applied. Finally, a cross-dataset experiment investigates whether the results obtained in a single-collection setting (the same dataset used for training and testing) generalize to cross-collection settings (different datasets used for training and testing).
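
The record contains no code and the description above stays at a high level. Purely as an illustration of the kind of pipeline it describes, the Python sketch below classifies a short clip as violent or non-violent with a Kinetics-pretrained 3D ConvNet backbone and zeroes out pixels outside per-frame person boxes to mimic the "context-free" footage analysed in the paper. Everything here is an assumption for illustration: torchvision ships no Inflated 3D ConvNet, so an R3D-18 backbone stands in, and ViolenceClassifier, mask_context, and the dummy boxes are hypothetical names, not the authors' implementation.

# Minimal sketch, not the authors' code: binary violence classification on a
# short clip with a Kinetics-400-pretrained 3D ConvNet backbone and a new
# two-way head. torchvision ships no I3D, so R3D-18 stands in for illustration.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18


class ViolenceClassifier(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Pretrained Kinetics-400 weights are downloaded on first use.
        self.backbone = r3d_18(weights="DEFAULT")
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, num_classes)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, channels, frames, height, width)
        return self.backbone(clip)


def mask_context(clip: torch.Tensor, boxes: torch.Tensor) -> torch.Tensor:
    # Zero out everything outside a per-frame person box (x1, y1, x2, y2) to
    # approximate the "context-free" footage studied in the paper; in practice
    # the boxes would come from an external person detector (assumed here).
    masked = torch.zeros_like(clip)
    for t in range(clip.shape[2]):
        x1, y1, x2, y2 = boxes[t].int().tolist()
        masked[:, :, t, y1:y2, x1:x2] = clip[:, :, t, y1:y2, x1:x2]
    return masked


if __name__ == "__main__":
    model = ViolenceClassifier().eval()
    clip = torch.rand(1, 3, 16, 112, 112)           # one 16-frame RGB clip
    boxes = torch.tensor([[20, 10, 90, 100]] * 16)  # dummy per-frame person box
    with torch.no_grad():
        logits = model(mask_context(clip, boxes))
    print(logits.softmax(dim=-1))                   # [P(non-violent), P(violent)]

Fine-tuning such a two-way head on a labelled violence dataset and then evaluating with and without mask_context would reproduce the spirit of the context analysis summarised above (a 2% to 5% drop on context-free footage).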