Video semantic segmentation with low latency

Bibliographic Details
Published in: Telkomnika 2024-10, Vol.22 (5), p.1147-1156
Main Authors: Gowda, D V Channappa; Kanagavalli, R
Format: Article
Language: English
container_end_page 1156
container_issue 5
container_start_page 1147
container_title Telkomnika
container_volume 22
creator Gowda, D V Channappa
Kanagavalli, R
description Recent advances in computer vision and deep learning have enabled systems to perform tasks that previously required human eyes and brains. Semantic video segmentation for autonomous cars remains difficult because convolutional neural networks (CNNs) are computationally costly while the application demands low latency and high performance. Using the deep learning architectures SegNet and FlowNet 2.0 on the Cambridge-driving labeled video database (CamVid) dataset, this work performs low-latency pixel-wise semantic segmentation of video. Because it builds on the SegNet and FlowNet topologies, the approach is suited to practical applications. A decision network chooses, for each image frame, between the optical flow network and the segmentation network based on the expected confidence score. Combining this decision method with adaptive key frame scheduling speeds up processing. ResNet50 SegNet achieves a mean intersection over union (MIoU) of 54.27% at an average of 19.57 frames per second (FPS). With the decision network, adaptive key frame sequencing, and FlowNet 2.0, graphics processing unit (GPU) throughput increased to 30.19 FPS with an MIoU of 47.65%. The GPU is used 47.65% of the time. This performance gain illustrates that the video semantic segmentation network is faster without sacrificing quality.
doi_str_mv 10.12928/TELKOMNIKA.v22i5.25157
format article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3115794979</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3115794979</sourcerecordid><originalsourceid>FETCH-proquest_journals_31157949793</originalsourceid><addsrcrecordid>eNqNjs0KgkAcxJcoSMpnSOiatvvfTPcYYRR9XaSriG21orvlrklv3x56gAaG-cHMYRCaEBwQYBDP0-SwPx9Pu_0qeAOIMICQhFEPOUAx-AwY7SOHLBn1rfEQuVqX2CrCELLYQbOLuHLlaV7n0ojCwr3m0uRGKOl1wjy8SnVelRsui88YDW55pbn7yxGabpJ0vfWfjXq1XJusVG0jbZVRYm-wBYsY_W_1Bc2dO6U</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3115794979</pqid></control><display><type>article</type><title>Video semantic segmentation with low latency</title><source>Publicly Available Content Database</source><creator>Gowda, D V Channappa ; Kanagavalli, R</creator><creatorcontrib>Gowda, D V Channappa ; Kanagavalli, R</creatorcontrib><description>Recent advances in computer vision and deep learning algorithms have yielded intriguing results. It can perform tasks previously requiring human eyes and brains. Semantic video segmentation for autonomous cars is difficult due to the high cost, low latency, and performance requirements of convolutional neural networks (CNNs). Deep learning architectures like SegNet and FlowNet 2.0 on the Cambridge-driving labeled video database (CamVid) dataset enable low-latency pixel-wise semantic segmentation of video features. Because it uses SegNet and FlowNet topologies, it is ideal for practical applications. The decision network chooses an optical flow or segmentation network for an image frame based on the expected confidence score. Combining this decision-making method with adaptive scheduling of the key frame approach can speed up the process. ResNet50 SegNet has a "54.27%" mean intersection over union (MIoU) and a "19.57" average FPS. 
In addition to decision network and adaptive key frame sequencing, FlowNet2.0 increased graphics processing unit (GPU) frame processing per second to "30.19" with a MIoU of "47.65%". The GPU is used "47.65%" of the time. This performance gain illustrates that the video semantic segmentation network is faster without sacrificing quality.</description><identifier>ISSN: 1693-6930</identifier><identifier>EISSN: 2302-9293</identifier><identifier>DOI: 10.12928/TELKOMNIKA.v22i5.25157</identifier><language>eng</language><publisher>Yogyakarta: Ahmad Dahlan University</publisher><subject>Algorithms ; Artificial neural networks ; Autonomous cars ; Classification ; Computer vision ; Decision making ; Deep learning ; Flow nets ; Graphics processing units ; Human engineering ; Human performance ; Image segmentation ; Machine learning ; Network latency ; Network topologies ; Neural networks ; Optical flow (image analysis) ; Semantic segmentation ; Semantics ; Topology</subject><ispartof>Telkomnika, 2024-10, Vol.22 (5), p.1147-1156</ispartof><rights>2024. This work is published under https://creativecommons.org/licenses/by/3.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/3115794979/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/3115794979?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,25753,27924,27925,37012,44590,75126</link.rule.ids></links><search><creatorcontrib>Gowda, D V Channappa</creatorcontrib><creatorcontrib>Kanagavalli, R</creatorcontrib><title>Video semantic segmentation with low latency</title><title>Telkomnika</title><description>Recent advances in computer vision and deep learning algorithms have yielded intriguing results. It can perform tasks previously requiring human eyes and brains. Semantic video segmentation for autonomous cars is difficult due to the high cost, low latency, and performance requirements of convolutional neural networks (CNNs). Deep learning architectures like SegNet and FlowNet 2.0 on the Cambridge-driving labeled video database (CamVid) dataset enable low-latency pixel-wise semantic segmentation of video features. Because it uses SegNet and FlowNet topologies, it is ideal for practical applications. The decision network chooses an optical flow or segmentation network for an image frame based on the expected confidence score. Combining this decision-making method with adaptive scheduling of the key frame approach can speed up the process. ResNet50 SegNet has a "54.27%" mean intersection over union (MIoU) and a "19.57" average FPS. 
In addition to decision network and adaptive key frame sequencing, FlowNet2.0 increased graphics processing unit (GPU) frame processing per second to "30.19" with a MIoU of "47.65%". The GPU is used "47.65%" of the time. This performance gain illustrates that the video semantic segmentation network is faster without sacrificing quality.</description><subject>Algorithms</subject><subject>Artificial neural networks</subject><subject>Autonomous cars</subject><subject>Classification</subject><subject>Computer vision</subject><subject>Decision making</subject><subject>Deep learning</subject><subject>Flow nets</subject><subject>Graphics processing units</subject><subject>Human engineering</subject><subject>Human performance</subject><subject>Image segmentation</subject><subject>Machine learning</subject><subject>Network latency</subject><subject>Network topologies</subject><subject>Neural networks</subject><subject>Optical flow (image analysis)</subject><subject>Semantic segmentation</subject><subject>Semantics</subject><subject>Topology</subject><issn>1693-6930</issn><issn>2302-9293</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><recordid>eNqNjs0KgkAcxJcoSMpnSOiatvvfTPcYYRR9XaSriG21orvlrklv3x56gAaG-cHMYRCaEBwQYBDP0-SwPx9Pu_0qeAOIMICQhFEPOUAx-AwY7SOHLBn1rfEQuVqX2CrCELLYQbOLuHLlaV7n0ojCwr3m0uRGKOl1wjy8SnVelRsui88YDW55pbn7yxGabpJ0vfWfjXq1XJusVG0jbZVRYm-wBYsY_W_1Bc2dO6U</recordid><startdate>20241001</startdate><enddate>20241001</enddate><creator>Gowda, D V Channappa</creator><creator>Kanagavalli, R</creator><general>Ahmad Dahlan 
University</general><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BVBZV</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>P5Z</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope></search><sort><creationdate>20241001</creationdate><title>Video semantic segmentation with low latency</title><author>Gowda, D V Channappa ; Kanagavalli, R</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_31157949793</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Artificial neural networks</topic><topic>Autonomous cars</topic><topic>Classification</topic><topic>Computer vision</topic><topic>Decision making</topic><topic>Deep learning</topic><topic>Flow nets</topic><topic>Graphics processing units</topic><topic>Human engineering</topic><topic>Human performance</topic><topic>Image segmentation</topic><topic>Machine learning</topic><topic>Network latency</topic><topic>Network topologies</topic><topic>Neural networks</topic><topic>Optical flow (image analysis)</topic><topic>Semantic segmentation</topic><topic>Semantics</topic><topic>Topology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Gowda, D V Channappa</creatorcontrib><creatorcontrib>Kanagavalli, R</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>Advanced Technologies &amp; Aerospace Database‎ (1962 - current)</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology 
Collection</collection><collection>East &amp; South Asia Database</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection</collection><collection>Advanced Technologies &amp; Aerospace Database</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><jtitle>Telkomnika</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Gowda, D V Channappa</au><au>Kanagavalli, R</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Video semantic segmentation with low latency</atitle><jtitle>Telkomnika</jtitle><date>2024-10-01</date><risdate>2024</risdate><volume>22</volume><issue>5</issue><spage>1147</spage><epage>1156</epage><pages>1147-1156</pages><issn>1693-6930</issn><eissn>2302-9293</eissn><abstract>Recent advances in computer vision and deep learning algorithms have yielded intriguing results. It can perform tasks previously requiring human eyes and brains. Semantic video segmentation for autonomous cars is difficult due to the high cost, low latency, and performance requirements of convolutional neural networks (CNNs). Deep learning architectures like SegNet and FlowNet 2.0 on the Cambridge-driving labeled video database (CamVid) dataset enable low-latency pixel-wise semantic segmentation of video features. Because it uses SegNet and FlowNet topologies, it is ideal for practical applications. The decision network chooses an optical flow or segmentation network for an image frame based on the expected confidence score. 
Combining this decision-making method with adaptive scheduling of the key frame approach can speed up the process. ResNet50 SegNet has a "54.27%" mean intersection over union (MIoU) and a "19.57" average FPS. In addition to decision network and adaptive key frame sequencing, FlowNet2.0 increased graphics processing unit (GPU) frame processing per second to "30.19" with a MIoU of "47.65%". The GPU is used "47.65%" of the time. This performance gain illustrates that the video semantic segmentation network is faster without sacrificing quality.</abstract><cop>Yogyakarta</cop><pub>Ahmad Dahlan University</pub><doi>10.12928/TELKOMNIKA.v22i5.25157</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1693-6930
ispartof Telkomnika, 2024-10, Vol.22 (5), p.1147-1156
issn 1693-6930
2302-9293
language eng
recordid cdi_proquest_journals_3115794979
source Publicly Available Content Database
subjects Algorithms
Artificial neural networks
Autonomous cars
Classification
Computer vision
Decision making
Deep learning
Flow nets
Graphics processing units
Human engineering
Human performance
Image segmentation
Machine learning
Network latency
Network topologies
Neural networks
Optical flow (image analysis)
Semantic segmentation
Semantics
Topology
title Video semantic segmentation with low latency
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T18%3A05%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Video%20semantic%20segmentation%20with%20low%20latency&rft.jtitle=Telkomnika&rft.au=Gowda,%20D%20V%20Channappa&rft.date=2024-10-01&rft.volume=22&rft.issue=5&rft.spage=1147&rft.epage=1156&rft.pages=1147-1156&rft.issn=1693-6930&rft.eissn=2302-9293&rft_id=info:doi/10.12928/TELKOMNIKA.v22i5.25157&rft_dat=%3Cproquest%3E3115794979%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_31157949793%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3115794979&rft_id=info:pmid/&rfr_iscdi=true