Video semantic segmentation with low latency
Recent advances in computer vision and deep learning have yielded intriguing results: such systems can now perform tasks that previously required human eyes and brains. Semantic video segmentation for autonomous cars is difficult because the task demands low latency and high accuracy while convolutional neural networks (CNNs) are computationally expensive. Deep learning architectures such as SegNet and FlowNet 2.0, applied to the Cambridge-driving labeled video database (CamVid) dataset, enable low-latency pixel-wise semantic segmentation of video. Because the approach combines SegNet and FlowNet topologies, it is well suited to practical applications. A decision network chooses either the optical-flow network or the segmentation network for each image frame based on a predicted confidence score, and combining this decision step with adaptive scheduling of key frames speeds up processing further. ResNet50-SegNet achieves a mean intersection over union (MIoU) of 54.27% at an average of 19.57 FPS. With the decision network and adaptive key-frame sequencing, FlowNet 2.0 raises graphics processing unit (GPU) throughput to 30.19 frames per second at a MIoU of 47.65%, with the GPU in use 47.65% of the time. This performance gain shows that the video semantic segmentation network runs faster without sacrificing quality.
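The abstract describes a per-frame scheduling scheme: the expensive segmentation network runs only on key frames, an optical-flow network propagates those labels to the frames in between, and a decision network plus an adaptive key-frame schedule decides when propagation is no longer trustworthy. The sketch below illustrates that control flow only; `segment`, `estimate_flow`, and `predict_confidence` are hypothetical placeholders standing in for SegNet, FlowNet 2.0, and the decision network, and the threshold and gap values are illustrative, not taken from the paper.

```python
# Minimal sketch of key-frame scheduling with flow propagation (assumed
# placeholder models, not the authors' implementation).
import numpy as np

def warp_labels(labels: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Warp a label map (H, W) backwards along a dense flow field (H, W, 2)."""
    h, w = labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Sample the key-frame label map at the position each pixel came from.
    src_x = np.clip(np.round(xs - flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys - flow[..., 1]).astype(int), 0, h - 1)
    return labels[src_y, src_x]

def segment_video(frames, segment, estimate_flow, predict_confidence,
                  conf_threshold=0.8, max_gap=5):
    """Yield one label map per frame, re-running `segment` only when needed.

    frames                          -- iterable of (H, W, 3) arrays
    segment(frame)                  -- full segmentation network (SegNet-style)
    estimate_flow(a, b)             -- optical-flow network (FlowNet-style)
    predict_confidence(frame, flow) -- decision network returning a score in [0, 1]
    """
    key_frame, key_labels, gap = None, None, 0
    for frame in frames:
        if key_frame is None:
            key_frame, key_labels, gap = frame, segment(frame), 0
            yield key_labels
            continue
        flow = estimate_flow(key_frame, frame)
        gap += 1
        # Adaptive key-frame scheduling: refresh the key frame when the
        # decision network is unsure or the propagation chain grows too long.
        if gap >= max_gap or predict_confidence(frame, flow) < conf_threshold:
            key_frame, key_labels, gap = frame, segment(frame), 0
            yield key_labels
        else:
            yield warp_labels(key_labels, flow)
```

The design rationale is that warping a label map along a flow field is far cheaper than a full CNN forward pass, so most frames reuse the latest key frame's prediction and only low-confidence or stale frames pay for full segmentation.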
Published in: | Telkomnika, 2024-10, Vol. 22 (5), p. 1147-1156 |
---|---|
Main Authors: | Gowda, D V Channappa; Kanagavalli, R |
Format: | Article |
Language: | English |
Subjects: | Algorithms; Artificial neural networks; Autonomous cars; Classification; Computer vision; Decision making; Deep learning; Flow nets; Graphics processing units; Human engineering; Human performance; Image segmentation; Machine learning; Network latency; Network topologies; Neural networks; Optical flow (image analysis); Semantic segmentation; Semantics; Topology |
cited_by | |
---|---|
cites | |
container_end_page | 1156 |
container_issue | 5 |
container_start_page | 1147 |
container_title | Telkomnika |
container_volume | 22 |
creator | Gowda, D V Channappa; Kanagavalli, R |
description | Recent advances in computer vision and deep learning have yielded intriguing results: such systems can now perform tasks that previously required human eyes and brains. Semantic video segmentation for autonomous cars is difficult because the task demands low latency and high accuracy while convolutional neural networks (CNNs) are computationally expensive. Deep learning architectures such as SegNet and FlowNet 2.0, applied to the Cambridge-driving labeled video database (CamVid) dataset, enable low-latency pixel-wise semantic segmentation of video. Because the approach combines SegNet and FlowNet topologies, it is well suited to practical applications. A decision network chooses either the optical-flow network or the segmentation network for each image frame based on a predicted confidence score, and combining this decision step with adaptive scheduling of key frames speeds up processing further. ResNet50-SegNet achieves a mean intersection over union (MIoU) of 54.27% at an average of 19.57 FPS. With the decision network and adaptive key-frame sequencing, FlowNet 2.0 raises graphics processing unit (GPU) throughput to 30.19 frames per second at a MIoU of 47.65%, with the GPU in use 47.65% of the time. This performance gain shows that the video semantic segmentation network runs faster without sacrificing quality. |
doi_str_mv | 10.12928/TELKOMNIKA.v22i5.25157 |
format | article |
publisher | Yogyakarta: Ahmad Dahlan University |
fulltext | fulltext |
identifier | ISSN: 1693-6930 |
ispartof | Telkomnika, 2024-10, Vol.22 (5), p.1147-1156 |
issn | 1693-6930; 2302-9293 |
language | eng |
recordid | cdi_proquest_journals_3115794979 |
source | Publicly Available Content Database |
subjects | Algorithms; Artificial neural networks; Autonomous cars; Classification; Computer vision; Decision making; Deep learning; Flow nets; Graphics processing units; Human engineering; Human performance; Image segmentation; Machine learning; Network latency; Network topologies; Neural networks; Optical flow (image analysis); Semantic segmentation; Semantics; Topology |
title | Video semantic segmentation with low latency |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T18%3A05%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Video%20semantic%20segmentation%20with%20low%20latency&rft.jtitle=Telkomnika&rft.au=Gowda,%20D%20V%20Channappa&rft.date=2024-10-01&rft.volume=22&rft.issue=5&rft.spage=1147&rft.epage=1156&rft.pages=1147-1156&rft.issn=1693-6930&rft.eissn=2302-9293&rft_id=info:doi/10.12928/TELKOMNIKA.v22i5.25157&rft_dat=%3Cproquest%3E3115794979%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_31157949793%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3115794979&rft_id=info:pmid/&rfr_iscdi=true |
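For context on the MIoU figures quoted in the abstract above (54.27% and 47.65%), mean intersection over union is the per-class intersection-over-union averaged across the classes present. The following is a minimal, generic sketch of the metric, not the authors' evaluation code; the example arrays are invented for illustration.

```python
# Generic mean-intersection-over-union (MIoU) sketch for semantic segmentation.
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Average per-class IoU between two integer label maps of equal shape."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:  # class absent from both maps: skip it
            continue
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else 0.0

# Example: two tiny 2x2 label maps over 3 classes.
pred = np.array([[0, 1], [2, 2]])
target = np.array([[0, 1], [2, 1]])
print(f"MIoU = {mean_iou(pred, target, num_classes=3):.4f}")  # 0.6667
```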