
End-to-End Rate-Distortion Optimization for Bi-Directional Learned Video Compression

Conventional video compression methods employ a linear transform and a block motion model, and the steps of motion estimation, mode and quantization parameter selection, and entropy coding are optimized individually due to the combinatorial nature of the end-to-end optimization problem. Learned video comp...

Bibliographic Details
Main Authors: Yilmaz, M. Akin, Tekalp, A. Murat
Format: Conference Proceeding
Language: English
container_end_page 1315
container_start_page 1311
creator Yilmaz, M. Akin
Tekalp, A. Murat
description Conventional video compression methods employ a linear transform and a block motion model, and the steps of motion estimation, mode and quantization parameter selection, and entropy coding are optimized individually due to the combinatorial nature of the end-to-end optimization problem. Learned video compression allows end-to-end rate-distortion optimized training of all nonlinear modules, the quantization parameter, and the entropy model simultaneously. While previous work on learned video compression considered training a sequential video codec based on end-to-end optimization of a cost averaged over pairs of successive frames, it is well known in conventional video compression that hierarchical, bi-directional coding outperforms sequential compression. In this paper, we propose for the first time end-to-end optimization of a hierarchical, bi-directional motion-compensated learned codec by accumulating the cost function over fixed-size groups of pictures (GOP). Experimental results show that the proposed learned bi-directional GOP coder outperforms the state-of-the-art end-to-end optimized learned sequential codec in rate-distortion performance, as expected.
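The idea of accumulating a rate-distortion cost over a fixed-size GOP with hierarchical, bi-directional prediction can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`hierarchical_order`, `gop_rd_cost`, `code_frame`), the assumption that both GOP end frames are already-decoded key frames, the decision to sum only the bi-directionally predicted frames, and the cost form `rate + lam * distortion` are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Tuple

# (past reference, target frame, future reference), all as frame indices
Triple = Tuple[int, int, int]


def hierarchical_order(gop_size: int) -> List[Triple]:
    """List the bi-directionally predicted frames of one GOP in coding order.

    Assumption: frames 0 and gop_size are key frames that are already
    decoded. The middle frame is predicted from both ends, then each half
    is handled recursively, so every reference is coded before it is used.
    """
    order: List[Triple] = []

    def split(lo: int, hi: int) -> None:
        if hi - lo < 2:  # no frame strictly between the two references
            return
        mid = (lo + hi) // 2
        order.append((lo, mid, hi))
        split(lo, mid)
        split(mid, hi)

    split(0, gop_size)
    return order


def gop_rd_cost(
    gop_size: int,
    code_frame: Callable[[int, int, int], Tuple[float, float]],
    lam: float,
) -> float:
    """Accumulate rate + lam * distortion over all predicted frames of a GOP.

    code_frame is a hypothetical stand-in for a learned codec step that
    returns (rate, distortion) for one target frame given its references.
    """
    total = 0.0
    for past, target, future in hierarchical_order(gop_size):
        rate, distortion = code_frame(past, target, future)
        total += rate + lam * distortion
    return total


if __name__ == "__main__":
    # Toy codec step: every frame costs 1.0 unit of rate and 2.0 of distortion.
    print(hierarchical_order(8))
    print(gop_rd_cost(8, lambda p, t, f: (1.0, 2.0), lam=0.5))
```

In a training loop, a single accumulated value of this kind would be the quantity that is minimized for the whole GOP, in contrast to a cost averaged over pairs of successive frames.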
doi_str_mv 10.1109/ICIP40778.2020.9190881
format conference_proceeding
fullrecord <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_9190881</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>9190881</ieee_id><sourcerecordid>9190881</sourcerecordid><originalsourceid>FETCH-LOGICAL-i251t-8559b18bae9cc2d7dfd06b02853b8b077ebde13e95392cfe1685b1883e5b86bf3</originalsourceid><addsrcrecordid>eNotkNFKxDAQRaMguK77BYL0B1IzyaadPGpdtVBYkdXXJWmmENltS9oX_Xqj7tPlzj0MM5exWxA5gDB3dVW_rkVZYi6FFLkBIxDhjK1MiVBKhEIZXZyzhVQIHPXaXLKrafoUiQYFC7bb9J7PA0-SvdmZ-GOY5iHOYeiz7TiHY_i2f6YbYvYQUhyp_R3YQ9aQjT357CN4GrJqOI6Rpill1-yis4eJViddsvenza564c32ua7uGx6khjmdo40DdJZM20pf-s6LwgmJWjl06StynkCR0crItiMoUCceFWmHhevUkt387w1EtB9jONr4tT-VoH4A-UNSGA</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>End-to-End Rate-Distortion Optimization for Bi-Directional Learned Video Compression</title><source>IEEE Xplore All Conference Series</source><creator>Yilmaz, M. Akin ; Tekalp, A. Murat</creator><creatorcontrib>Yilmaz, M. Akin ; Tekalp, A. Murat</creatorcontrib><description>Conventional video compression methods employ a linear transform and block motion model, and the steps of motion estimation, mode and quantization parameter selection, and entropy coding are optimized individually due to combinatorial nature of the end-to-end optimization problem. Learned video compression allows end-to-end rate-distortion optimized training of all nonlinear modules, quantization parameter and entropy model simultaneously. While previous work on learned video compression considered training a sequential video codec based on end-to-end optimization of cost averaged over pairs of successive frames, it is well-known in conventional video compression that hierarchical, bi-directional coding outperforms sequential compression. 
In this paper, we propose for the first time end-to-end optimization of a hierarchical, bi-directional motion compensated learned codec by accumulating cost function over fixed-size groups of pictures (GOP). Experimental results show that the rate-distortion performance of our proposed learned bi-directional GOP coder outperforms the state-of-the-art end-to-end optimized learned sequential compression as expected.</description><identifier>EISSN: 2381-8549</identifier><identifier>EISBN: 9781728163956</identifier><identifier>EISBN: 1728163951</identifier><identifier>DOI: 10.1109/ICIP40778.2020.9190881</identifier><language>eng</language><publisher>IEEE</publisher><subject>bi-directional motion compensation ; Bidirectional control ; deep learning ; end-to-end optimization ; group of pictures ; Image coding ; Motion compensation ; Optimization ; Quantization (signal) ; Training ; Video compression</subject><ispartof>2020 IEEE International Conference on Image Processing (ICIP), 2020, p.1311-1315</ispartof><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/9190881$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,27923,54553,54930</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/9190881$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Yilmaz, M. Akin</creatorcontrib><creatorcontrib>Tekalp, A. 
Murat</creatorcontrib><title>End-to-End Rate-Distortion Optimization for Bi-Directional Learned Video Compression</title><title>2020 IEEE International Conference on Image Processing (ICIP)</title><addtitle>ICIP</addtitle><description>Conventional video compression methods employ a linear transform and block motion model, and the steps of motion estimation, mode and quantization parameter selection, and entropy coding are optimized individually due to combinatorial nature of the end-to-end optimization problem. Learned video compression allows end-to-end rate-distortion optimized training of all nonlinear modules, quantization parameter and entropy model simultaneously. While previous work on learned video compression considered training a sequential video codec based on end-to-end optimization of cost averaged over pairs of successive frames, it is well-known in conventional video compression that hierarchical, bi-directional coding outperforms sequential compression. In this paper, we propose for the first time end-to-end optimization of a hierarchical, bi-directional motion compensated learned codec by accumulating cost function over fixed-size groups of pictures (GOP). 
Experimental results show that the rate-distortion performance of our proposed learned bi-directional GOP coder outperforms the state-of-the-art end-to-end optimized learned sequential compression as expected.</description><subject>bi-directional motion compensation</subject><subject>Bidirectional control</subject><subject>deep learning</subject><subject>end-to-end optimization</subject><subject>group of pictures</subject><subject>Image coding</subject><subject>Motion compensation</subject><subject>Optimization</subject><subject>Quantization (signal)</subject><subject>Training</subject><subject>Video compression</subject><issn>2381-8549</issn><isbn>9781728163956</isbn><isbn>1728163951</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2020</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotkNFKxDAQRaMguK77BYL0B1IzyaadPGpdtVBYkdXXJWmmENltS9oX_Xqj7tPlzj0MM5exWxA5gDB3dVW_rkVZYi6FFLkBIxDhjK1MiVBKhEIZXZyzhVQIHPXaXLKrafoUiQYFC7bb9J7PA0-SvdmZ-GOY5iHOYeiz7TiHY_i2f6YbYvYQUhyp_R3YQ9aQjT357CN4GrJqOI6Rpill1-yis4eJViddsvenza564c32ua7uGx6khjmdo40DdJZM20pf-s6LwgmJWjl06StynkCR0crItiMoUCceFWmHhevUkt387w1EtB9jONr4tT-VoH4A-UNSGA</recordid><startdate>20201001</startdate><enddate>20201001</enddate><creator>Yilmaz, M. Akin</creator><creator>Tekalp, A. Murat</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>20201001</creationdate><title>End-to-End Rate-Distortion Optimization for Bi-Directional Learned Video Compression</title><author>Yilmaz, M. Akin ; Tekalp, A. 
Murat</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i251t-8559b18bae9cc2d7dfd06b02853b8b077ebde13e95392cfe1685b1883e5b86bf3</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2020</creationdate><topic>bi-directional motion compensation</topic><topic>Bidirectional control</topic><topic>deep learning</topic><topic>end-to-end optimization</topic><topic>group of pictures</topic><topic>Image coding</topic><topic>Motion compensation</topic><topic>Optimization</topic><topic>Quantization (signal)</topic><topic>Training</topic><topic>Video compression</topic><toplevel>online_resources</toplevel><creatorcontrib>Yilmaz, M. Akin</creatorcontrib><creatorcontrib>Tekalp, A. Murat</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore Digital Library</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yilmaz, M. Akin</au><au>Tekalp, A. 
Murat</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>End-to-End Rate-Distortion Optimization for Bi-Directional Learned Video Compression</atitle><btitle>2020 IEEE International Conference on Image Processing (ICIP)</btitle><stitle>ICIP</stitle><date>2020-10-01</date><risdate>2020</risdate><spage>1311</spage><epage>1315</epage><pages>1311-1315</pages><eissn>2381-8549</eissn><eisbn>9781728163956</eisbn><eisbn>1728163951</eisbn><abstract>Conventional video compression methods employ a linear transform and block motion model, and the steps of motion estimation, mode and quantization parameter selection, and entropy coding are optimized individually due to combinatorial nature of the end-to-end optimization problem. Learned video compression allows end-to-end rate-distortion optimized training of all nonlinear modules, quantization parameter and entropy model simultaneously. While previous work on learned video compression considered training a sequential video codec based on end-to-end optimization of cost averaged over pairs of successive frames, it is well-known in conventional video compression that hierarchical, bi-directional coding outperforms sequential compression. In this paper, we propose for the first time end-to-end optimization of a hierarchical, bi-directional motion compensated learned codec by accumulating cost function over fixed-size groups of pictures (GOP). Experimental results show that the rate-distortion performance of our proposed learned bi-directional GOP coder outperforms the state-of-the-art end-to-end optimized learned sequential compression as expected.</abstract><pub>IEEE</pub><doi>10.1109/ICIP40778.2020.9190881</doi><tpages>5</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier EISSN: 2381-8549
ispartof 2020 IEEE International Conference on Image Processing (ICIP), 2020, p.1311-1315
issn 2381-8549
language eng
recordid cdi_ieee_primary_9190881
source IEEE Xplore All Conference Series
subjects bi-directional motion compensation
Bidirectional control
deep learning
end-to-end optimization
group of pictures
Image coding
Motion compensation
Optimization
Quantization (signal)
Training
Video compression
title End-to-End Rate-Distortion Optimization for Bi-Directional Learned Video Compression
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T18%3A05%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=End-to-End%20Rate-Distortion%20Optimization%20for%20Bi-Directional%20Learned%20Video%20Compression&rft.btitle=2020%20IEEE%20International%20Conference%20on%20Image%20Processing%20(ICIP)&rft.au=Yilmaz,%20M.%20Akin&rft.date=2020-10-01&rft.spage=1311&rft.epage=1315&rft.pages=1311-1315&rft.eissn=2381-8549&rft_id=info:doi/10.1109/ICIP40778.2020.9190881&rft.eisbn=9781728163956&rft.eisbn_list=1728163951&rft_dat=%3Cieee_CHZPO%3E9190881%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i251t-8559b18bae9cc2d7dfd06b02853b8b077ebde13e95392cfe1685b1883e5b86bf3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=9190881&rfr_iscdi=true