
Anomaly Detection in Aerial Videos with Transformers

Unmanned aerial vehicles (UAVs) are widely applied for inspection, search, and rescue operations by virtue of their low-cost, large-coverage, real-time, and high-resolution data acquisition capacities. Massive volumes of aerial videos are produced in these processes, in which normal events...

Bibliographic Details
Published in: arXiv.org 2022-09
Main Authors: Pu, Jin; Mou, Lichao; Gui-Song, Xia; Xiao Xiang Zhu
Format: Article
Language: English
cited_by
cites
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Pu, Jin
Mou, Lichao
Gui-Song, Xia
Xiao Xiang Zhu
description Unmanned aerial vehicles (UAVs) are widely applied for inspection, search, and rescue operations by virtue of their low-cost, large-coverage, real-time, and high-resolution data acquisition capacities. These operations produce massive volumes of aerial video in which normal events account for an overwhelming proportion. Manually localizing and extracting abnormal events, which may contain valuable information, from long video streams is extremely difficult. We therefore develop anomaly detection methods to address this problem. In this paper, we create a new dataset, named Drone-Anomaly, for anomaly detection in aerial videos. The dataset provides 37 training video sequences and 22 testing video sequences from 7 different realistic scenes with various anomalous events. It contains 87,488 color video frames (51,635 for training and 35,853 for testing) of size \(640 \times 640\) at 30 frames per second. Based on this dataset, we evaluate existing methods and offer a benchmark for this task. Furthermore, we present a new baseline model, ANomaly Detection with Transformers (ANDT), which treats consecutive video frames as a sequence of tubelets, uses a Transformer encoder to learn feature representations from the sequence, and uses a decoder to predict the next frame. Our network models normality in the training phase and identifies an event with unpredictable temporal dynamics as an anomaly in the test phase. Moreover, to comprehensively evaluate the performance of our proposed method, we use not only our Drone-Anomaly dataset but also another dataset. We will make our dataset and code publicly available. A demo video is available at https://youtu.be/ancczYryOBY.
doi_str_mv 10.48550/arxiv.2209.13363
format article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_2718739500</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2718739500</sourcerecordid><originalsourceid>FETCH-LOGICAL-a953-7483ecc8280e826d76aa6fbb58dadd8fbe8293ba0391420f875e56a90e2969423</originalsourceid><addsrcrecordid>eNotjclqwzAUAEWhkJDmA3oT9Gz3-T2tR5OuEOjF9BpkWyYKjtVKTpe_b6A5DcxhhrHbCkphpIR7l37CV4kItqyIFF2xJRJVhRGIC7bO-QAAqDRKSUsm6ike3fjLH_zsuznEiYeJ1z4FN_L30PuY-XeY97xJbspDTEef8g27HtyY_frCFWueHpvNS7F9e37d1NvCWUmFFoZ81xk04A2qXivn1NC20vSu783Qnq2l1gHZSiAMRksvlbPg0SorkFbs7j_7keLnyed5d4inNJ2PO9SV0WQlAP0BTFRF_A</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2718739500</pqid></control><display><type>article</type><title>Anomaly Detection in Aerial Videos with Transformers</title><source>Publicly Available Content Database</source><creator>Pu, Jin ; Mou, Lichao ; Gui-Song, Xia ; Xiao Xiang Zhu</creator><creatorcontrib>Pu, Jin ; Mou, Lichao ; Gui-Song, Xia ; Xiao Xiang Zhu</creatorcontrib><description>Unmanned aerial vehicles (UAVs) are widely applied for purposes of inspection, search, and rescue operations by the virtue of low-cost, large-coverage, real-time, and high-resolution data acquisition capacities. Massive volumes of aerial videos are produced in these processes, in which normal events often account for an overwhelming proportion. It is extremely difficult to localize and extract abnormal events containing potentially valuable information from long video streams manually. Therefore, we are dedicated to developing anomaly detection methods to solve this issue. In this paper, we create a new dataset, named DroneAnomaly, for anomaly detection in aerial videos. This dataset provides 37 training video sequences and 22 testing video sequences from 7 different realistic scenes with various anomalous events. 
There are 87,488 color video frames (51,635 for training and 35,853 for testing) with the size of \(640 \times 640\) at 30 frames per second. Based on this dataset, we evaluate existing methods and offer a benchmark for this task. Furthermore, we present a new baseline model, ANomaly Detection with Transformers (ANDT), which treats consecutive video frames as a sequence of tubelets, utilizes a Transformer encoder to learn feature representations from the sequence, and leverages a decoder to predict the next frame. Our network models normality in the training phase and identifies an event with unpredictable temporal dynamics as an anomaly in the test phase. Moreover, To comprehensively evaluate the performance of our proposed method, we use not only our Drone-Anomaly dataset but also another dataset. We will make our dataset and code publicly available. A demo video is available at https://youtu.be/ancczYryOBY. We make our dataset and code publicly available .</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2209.13363</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Anomalies ; Coders ; Data acquisition ; Datasets ; Frames (data processing) ; Frames per second ; Inspection ; Performance evaluation ; Rescue operations ; Training ; Transformers ; Unmanned aerial vehicles ; Video data</subject><ispartof>arXiv.org, 2022-09</ispartof><rights>2022. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/2718739500?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>780,784,25753,27925,37012,44590</link.rule.ids></links><search><creatorcontrib>Pu, Jin</creatorcontrib><creatorcontrib>Mou, Lichao</creatorcontrib><creatorcontrib>Gui-Song, Xia</creatorcontrib><creatorcontrib>Xiao Xiang Zhu</creatorcontrib><title>Anomaly Detection in Aerial Videos with Transformers</title><title>arXiv.org</title><description>Unmanned aerial vehicles (UAVs) are widely applied for purposes of inspection, search, and rescue operations by the virtue of low-cost, large-coverage, real-time, and high-resolution data acquisition capacities. Massive volumes of aerial videos are produced in these processes, in which normal events often account for an overwhelming proportion. It is extremely difficult to localize and extract abnormal events containing potentially valuable information from long video streams manually. Therefore, we are dedicated to developing anomaly detection methods to solve this issue. In this paper, we create a new dataset, named DroneAnomaly, for anomaly detection in aerial videos. This dataset provides 37 training video sequences and 22 testing video sequences from 7 different realistic scenes with various anomalous events. There are 87,488 color video frames (51,635 for training and 35,853 for testing) with the size of \(640 \times 640\) at 30 frames per second. Based on this dataset, we evaluate existing methods and offer a benchmark for this task. 
Furthermore, we present a new baseline model, ANomaly Detection with Transformers (ANDT), which treats consecutive video frames as a sequence of tubelets, utilizes a Transformer encoder to learn feature representations from the sequence, and leverages a decoder to predict the next frame. Our network models normality in the training phase and identifies an event with unpredictable temporal dynamics as an anomaly in the test phase. Moreover, To comprehensively evaluate the performance of our proposed method, we use not only our Drone-Anomaly dataset but also another dataset. We will make our dataset and code publicly available. A demo video is available at https://youtu.be/ancczYryOBY. We make our dataset and code publicly available .</description><subject>Anomalies</subject><subject>Coders</subject><subject>Data acquisition</subject><subject>Datasets</subject><subject>Frames (data processing)</subject><subject>Frames per second</subject><subject>Inspection</subject><subject>Performance evaluation</subject><subject>Rescue operations</subject><subject>Training</subject><subject>Transformers</subject><subject>Unmanned aerial vehicles</subject><subject>Video data</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNotjclqwzAUAEWhkJDmA3oT9Gz3-T2tR5OuEOjF9BpkWyYKjtVKTpe_b6A5DcxhhrHbCkphpIR7l37CV4kItqyIFF2xJRJVhRGIC7bO-QAAqDRKSUsm6ike3fjLH_zsuznEiYeJ1z4FN_L30PuY-XeY97xJbspDTEef8g27HtyY_frCFWueHpvNS7F9e37d1NvCWUmFFoZ81xk04A2qXivn1NC20vSu783Qnq2l1gHZSiAMRksvlbPg0SorkFbs7j_7keLnyed5d4inNJ2PO9SV0WQlAP0BTFRF_A</recordid><startdate>20220925</startdate><enddate>20220925</enddate><creator>Pu, Jin</creator><creator>Mou, Lichao</creator><creator>Gui-Song, Xia</creator><creator>Xiao Xiang Zhu</creator><general>Cornell University Library, 
arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20220925</creationdate><title>Anomaly Detection in Aerial Videos with Transformers</title><author>Pu, Jin ; Mou, Lichao ; Gui-Song, Xia ; Xiao Xiang Zhu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a953-7483ecc8280e826d76aa6fbb58dadd8fbe8293ba0391420f875e56a90e2969423</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Anomalies</topic><topic>Coders</topic><topic>Data acquisition</topic><topic>Datasets</topic><topic>Frames (data processing)</topic><topic>Frames per second</topic><topic>Inspection</topic><topic>Performance evaluation</topic><topic>Rescue operations</topic><topic>Training</topic><topic>Transformers</topic><topic>Unmanned aerial vehicles</topic><topic>Video data</topic><toplevel>online_resources</toplevel><creatorcontrib>Pu, Jin</creatorcontrib><creatorcontrib>Mou, Lichao</creatorcontrib><creatorcontrib>Gui-Song, Xia</creatorcontrib><creatorcontrib>Xiao Xiang Zhu</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium 
Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Pu, Jin</au><au>Mou, Lichao</au><au>Gui-Song, Xia</au><au>Xiao Xiang Zhu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Anomaly Detection in Aerial Videos with Transformers</atitle><jtitle>arXiv.org</jtitle><date>2022-09-25</date><risdate>2022</risdate><eissn>2331-8422</eissn><abstract>Unmanned aerial vehicles (UAVs) are widely applied for purposes of inspection, search, and rescue operations by the virtue of low-cost, large-coverage, real-time, and high-resolution data acquisition capacities. Massive volumes of aerial videos are produced in these processes, in which normal events often account for an overwhelming proportion. It is extremely difficult to localize and extract abnormal events containing potentially valuable information from long video streams manually. Therefore, we are dedicated to developing anomaly detection methods to solve this issue. In this paper, we create a new dataset, named DroneAnomaly, for anomaly detection in aerial videos. This dataset provides 37 training video sequences and 22 testing video sequences from 7 different realistic scenes with various anomalous events. There are 87,488 color video frames (51,635 for training and 35,853 for testing) with the size of \(640 \times 640\) at 30 frames per second. Based on this dataset, we evaluate existing methods and offer a benchmark for this task. 
Furthermore, we present a new baseline model, ANomaly Detection with Transformers (ANDT), which treats consecutive video frames as a sequence of tubelets, utilizes a Transformer encoder to learn feature representations from the sequence, and leverages a decoder to predict the next frame. Our network models normality in the training phase and identifies an event with unpredictable temporal dynamics as an anomaly in the test phase. Moreover, To comprehensively evaluate the performance of our proposed method, we use not only our Drone-Anomaly dataset but also another dataset. We will make our dataset and code publicly available. A demo video is available at https://youtu.be/ancczYryOBY. We make our dataset and code publicly available .</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2209.13363</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2022-09
issn 2331-8422
language eng
recordid cdi_proquest_journals_2718739500
source Publicly Available Content Database
subjects Anomalies
Coders
Data acquisition
Datasets
Frames (data processing)
Frames per second
Inspection
Performance evaluation
Rescue operations
Training
Transformers
Unmanned aerial vehicles
Video data
title Anomaly Detection in Aerial Videos with Transformers
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T04%3A07%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Anomaly%20Detection%20in%20Aerial%20Videos%20with%20Transformers&rft.jtitle=arXiv.org&rft.au=Pu,%20Jin&rft.date=2022-09-25&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2209.13363&rft_dat=%3Cproquest%3E2718739500%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a953-7483ecc8280e826d76aa6fbb58dadd8fbe8293ba0391420f875e56a90e2969423%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2718739500&rft_id=info:pmid/&rfr_iscdi=true
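
The description above says that ANDT treats consecutive frames as a sequence of tubelets and flags events whose next frame is hard to predict. The record gives no implementation details, so the following is only a minimal sketch of those two ideas, not the authors' code. The function names `extract_tubelets` and `psnr`, the tubelet size, the 64 x 64 toy clip, and the choice of PSNR as the prediction-quality measure are all assumptions made for illustration.

```python
import numpy as np


def extract_tubelets(clip, t, p):
    """Split a clip of shape (T, H, W, C) into non-overlapping tubelets.

    Each tubelet spans t frames and a p x p spatial patch and is flattened
    into one token vector, giving shape (num_tubelets, t * p * p * C).
    The sizes t and p are illustrative assumptions, not values from the paper.
    """
    T, H, W, C = clip.shape
    if T % t or H % p or W % p:
        raise ValueError("clip dimensions must be divisible by the tubelet size")
    # Separate each axis into (block index, offset within block).
    x = clip.reshape(T // t, t, H // p, p, W // p, p, C)
    # Bring the three block indices to the front, offsets and channels last.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, t * p * p * C)


def psnr(predicted, actual, max_value=1.0):
    """Peak signal-to-noise ratio between a predicted and an actual frame.

    Assumed scoring rule: a low PSNR means the frame was poorly predicted,
    which a prediction-based detector would treat as more anomalous.
    """
    diff = predicted.astype(np.float64) - actual.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)


# Toy clip: 4 frames of 64 x 64 RGB with values in [0, 1).
clip = np.random.default_rng(0).random((4, 64, 64, 3))
tokens = extract_tubelets(clip, t=2, p=16)
print(tokens.shape)  # (32, 1536)
```

In a full pipeline the token sequence would be fed to a Transformer encoder and a decoder would produce the predicted next frame; here `psnr` only shows how a predicted frame could be compared with the observed one once such a prediction exists.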