Scale and density invariant head detection deep model for crowd counting in pedestrian crowds

Crowd counting in high density crowds has significant importance in crowd safety and crowd management. Existing state-of-the-art methods employ regression models to count the number of people in an image. However, regression models are blind and cannot localize the individuals in the scene. On the o...

Bibliographic Details
Published in: The Visual computer 2021-08, Vol.37 (8), p.2127-2137
Main Authors: Khan, Sultan Daud, Basalamah, Saleh
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c319t-6b2c226362105d7940bbbd741c5cc07c80b1f2a624d77527fafc9c012220f1e83
cites cdi_FETCH-LOGICAL-c319t-6b2c226362105d7940bbbd741c5cc07c80b1f2a624d77527fafc9c012220f1e83
container_end_page 2137
container_issue 8
container_start_page 2127
container_title The Visual computer
container_volume 37
creator Khan, Sultan Daud
Basalamah, Saleh
description Crowd counting in high-density crowds is important for crowd safety and crowd management. Existing state-of-the-art methods employ regression models to count the number of people in an image. However, regression models are blind and cannot localize the individuals in the scene. On the other hand, detection-based crowd counting in high-density crowds is a challenging problem due to significant variations in scale, pose and appearance. The variations in pose and appearance can be handled by large-capacity convolutional neural networks. However, the problem of scale lies at the heart of every detector and needs to be addressed for effective crowd counting. In this paper, we propose an end-to-end scale-invariant head detection framework that can handle a broad range of scales. We demonstrate that scale variations can be handled by modeling a set of specialized scale-specific convolutional neural networks with different receptive fields. These scale-specific detectors are combined into a single backbone network, where the parameters of the network are optimized in an end-to-end fashion. We evaluated our framework on challenging benchmark datasets, i.e., UCF-QNRF and UCSD. From the experimental results, we demonstrate that the proposed framework outperforms existing methods by a large margin.
doi_str_mv 10.1007/s00371-020-01974-7
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2917992667</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2917992667</sourcerecordid><originalsourceid>FETCH-LOGICAL-c319t-6b2c226362105d7940bbbd741c5cc07c80b1f2a624d77527fafc9c012220f1e83</originalsourceid><addsrcrecordid>eNp9kE1LAzEQhoMoWKt_wFPAc3Qy-5HNUYpaoeBBPUrIJtm6pc2uyVbpvzfrCt48zdf7zAwvIZccrjmAuIkAmeAMEBhwKXImjsiM5xkyzHhxTGbARcVQVPKUnMW4gVSLXM7I27PRW0e1t9Q6H9vhQFv_qUOr_UDfnR7bgzND2_mUuZ7uOuu2tOkCNaH7stR0ez-0fp0w2jvr4jCy0zCek5NGb6O7-I1z8np_97JYstXTw-PidsVMxuXAyhoNYpmVyKGwQuZQ17UVOTeFMSBMBTVvUJeYWyEKFI1ujDTAEREa7qpsTq6mvX3oPvbpB7Xp9sGnkwolF1JiWYqkwkmVnosxuEb1od3pcFAc1GijmmxUyUb1Y6MaoWyCYhL7tQt_q_-hvgHRTnV1</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2917992667</pqid></control><display><type>article</type><title>Scale and density invariant head detection deep model for crowd counting in pedestrian crowds</title><source>Springer Link</source><creator>Khan, Sultan Daud ; Basalamah, Saleh</creator><creatorcontrib>Khan, Sultan Daud ; Basalamah, Saleh</creatorcontrib><description>Crowd counting in high density crowds has significant importance in crowd safety and crowd management. Existing state-of-the-art methods employ regression models to count the number of people in an image. However, regression models are blind and cannot localize the individuals in the scene. On the other hand, detection-based crowd counting in high density crowds is a challenging problem due to significant variations in scales, poses and appearances. The variations in poses and appearances can be handled through large capacity convolutional neural networks. However, the problem of scale lies in the heart of every detector and needs to be addressed for effective crowd counting. 
In this paper, we propose a end-to-end scale invariant head detection framework that can handle broad range of scales. We demonstrate that scale variations can be handled by modeling a set of specialized scale-specific convolutional neural networks with different receptive fields. These scale-specific detectors are combined into a single backbone network, where parameters of the network is optimized in end-to-end fashion. We evaluated our framework on challenging benchmark datasets, i.e., UCF-QNRF, UCSD. From experiment results, we demonstrate that proposed framework beats existing methods by a great margin.</description><identifier>ISSN: 0178-2789</identifier><identifier>EISSN: 1432-2315</identifier><identifier>DOI: 10.1007/s00371-020-01974-7</identifier><language>eng</language><publisher>Berlin/Heidelberg: Springer Berlin Heidelberg</publisher><subject>Artificial Intelligence ; Artificial neural networks ; Cameras ; Computer Graphics ; Computer networks ; Computer Science ; High density ; Image Processing and Computer Vision ; Invariants ; Methods ; Neural networks ; Original Article ; Pedestrians ; Regression models ; Safety management ; Sensors</subject><ispartof>The Visual computer, 2021-08, Vol.37 (8), p.2127-2137</ispartof><rights>Springer-Verlag GmbH Germany, part of Springer Nature 2020</rights><rights>Springer-Verlag GmbH Germany, part of Springer Nature 2020.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c319t-6b2c226362105d7940bbbd741c5cc07c80b1f2a624d77527fafc9c012220f1e83</citedby><cites>FETCH-LOGICAL-c319t-6b2c226362105d7940bbbd741c5cc07c80b1f2a624d77527fafc9c012220f1e83</cites><orcidid>0000-0002-7406-8441</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Khan, Sultan 
Daud</creatorcontrib><creatorcontrib>Basalamah, Saleh</creatorcontrib><title>Scale and density invariant head detection deep model for crowd counting in pedestrian crowds</title><title>The Visual computer</title><addtitle>Vis Comput</addtitle><description>Crowd counting in high density crowds has significant importance in crowd safety and crowd management. Existing state-of-the-art methods employ regression models to count the number of people in an image. However, regression models are blind and cannot localize the individuals in the scene. On the other hand, detection-based crowd counting in high density crowds is a challenging problem due to significant variations in scales, poses and appearances. The variations in poses and appearances can be handled through large capacity convolutional neural networks. However, the problem of scale lies in the heart of every detector and needs to be addressed for effective crowd counting. In this paper, we propose a end-to-end scale invariant head detection framework that can handle broad range of scales. We demonstrate that scale variations can be handled by modeling a set of specialized scale-specific convolutional neural networks with different receptive fields. These scale-specific detectors are combined into a single backbone network, where parameters of the network is optimized in end-to-end fashion. We evaluated our framework on challenging benchmark datasets, i.e., UCF-QNRF, UCSD. 
From experiment results, we demonstrate that proposed framework beats existing methods by a great margin.</description><subject>Artificial Intelligence</subject><subject>Artificial neural networks</subject><subject>Cameras</subject><subject>Computer Graphics</subject><subject>Computer networks</subject><subject>Computer Science</subject><subject>High density</subject><subject>Image Processing and Computer Vision</subject><subject>Invariants</subject><subject>Methods</subject><subject>Neural networks</subject><subject>Original Article</subject><subject>Pedestrians</subject><subject>Regression models</subject><subject>Safety management</subject><subject>Sensors</subject><issn>0178-2789</issn><issn>1432-2315</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp9kE1LAzEQhoMoWKt_wFPAc3Qy-5HNUYpaoeBBPUrIJtm6pc2uyVbpvzfrCt48zdf7zAwvIZccrjmAuIkAmeAMEBhwKXImjsiM5xkyzHhxTGbARcVQVPKUnMW4gVSLXM7I27PRW0e1t9Q6H9vhQFv_qUOr_UDfnR7bgzND2_mUuZ7uOuu2tOkCNaH7stR0ez-0fp0w2jvr4jCy0zCek5NGb6O7-I1z8np_97JYstXTw-PidsVMxuXAyhoNYpmVyKGwQuZQ17UVOTeFMSBMBTVvUJeYWyEKFI1ujDTAEREa7qpsTq6mvX3oPvbpB7Xp9sGnkwolF1JiWYqkwkmVnosxuEb1od3pcFAc1GijmmxUyUb1Y6MaoWyCYhL7tQt_q_-hvgHRTnV1</recordid><startdate>20210801</startdate><enddate>20210801</enddate><creator>Khan, Sultan Daud</creator><creator>Basalamah, Saleh</creator><general>Springer Berlin Heidelberg</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><orcidid>https://orcid.org/0000-0002-7406-8441</orcidid></search><sort><creationdate>20210801</creationdate><title>Scale and density 
invariant head detection deep model for crowd counting in pedestrian crowds</title><author>Khan, Sultan Daud ; Basalamah, Saleh</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c319t-6b2c226362105d7940bbbd741c5cc07c80b1f2a624d77527fafc9c012220f1e83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Artificial Intelligence</topic><topic>Artificial neural networks</topic><topic>Cameras</topic><topic>Computer Graphics</topic><topic>Computer networks</topic><topic>Computer Science</topic><topic>High density</topic><topic>Image Processing and Computer Vision</topic><topic>Invariants</topic><topic>Methods</topic><topic>Neural networks</topic><topic>Original Article</topic><topic>Pedestrians</topic><topic>Regression models</topic><topic>Safety management</topic><topic>Sensors</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Khan, Sultan Daud</creatorcontrib><creatorcontrib>Basalamah, Saleh</creatorcontrib><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT 
USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><jtitle>The Visual computer</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Khan, Sultan Daud</au><au>Basalamah, Saleh</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Scale and density invariant head detection deep model for crowd counting in pedestrian crowds</atitle><jtitle>The Visual computer</jtitle><stitle>Vis Comput</stitle><date>2021-08-01</date><risdate>2021</risdate><volume>37</volume><issue>8</issue><spage>2127</spage><epage>2137</epage><pages>2127-2137</pages><issn>0178-2789</issn><eissn>1432-2315</eissn><abstract>Crowd counting in high density crowds has significant importance in crowd safety and crowd management. Existing state-of-the-art methods employ regression models to count the number of people in an image. However, regression models are blind and cannot localize the individuals in the scene. On the other hand, detection-based crowd counting in high density crowds is a challenging problem due to significant variations in scales, poses and appearances. The variations in poses and appearances can be handled through large capacity convolutional neural networks. However, the problem of scale lies in the heart of every detector and needs to be addressed for effective crowd counting. In this paper, we propose a end-to-end scale invariant head detection framework that can handle broad range of scales. We demonstrate that scale variations can be handled by modeling a set of specialized scale-specific convolutional neural networks with different receptive fields. These scale-specific detectors are combined into a single backbone network, where parameters of the network is optimized in end-to-end fashion. We evaluated our framework on challenging benchmark datasets, i.e., UCF-QNRF, UCSD. 
From experiment results, we demonstrate that proposed framework beats existing methods by a great margin.</abstract><cop>Berlin/Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/s00371-020-01974-7</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0002-7406-8441</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0178-2789
ispartof The Visual computer, 2021-08, Vol.37 (8), p.2127-2137
issn 0178-2789
1432-2315
language eng
recordid cdi_proquest_journals_2917992667
source Springer Link
subjects Artificial Intelligence
Artificial neural networks
Cameras
Computer Graphics
Computer networks
Computer Science
High density
Image Processing and Computer Vision
Invariants
Methods
Neural networks
Original Article
Pedestrians
Regression models
Safety management
Sensors
title Scale and density invariant head detection deep model for crowd counting in pedestrian crowds
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T08%3A05%3A15IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Scale%20and%20density%20invariant%20head%20detection%20deep%20model%20for%20crowd%20counting%20in%20pedestrian%20crowds&rft.jtitle=The%20Visual%20computer&rft.au=Khan,%20Sultan%20Daud&rft.date=2021-08-01&rft.volume=37&rft.issue=8&rft.spage=2127&rft.epage=2137&rft.pages=2127-2137&rft.issn=0178-2789&rft.eissn=1432-2315&rft_id=info:doi/10.1007/s00371-020-01974-7&rft_dat=%3Cproquest_cross%3E2917992667%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c319t-6b2c226362105d7940bbbd741c5cc07c80b1f2a624d77527fafc9c012220f1e83%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2917992667&rft_id=info:pmid/&rfr_iscdi=true