
PixelNN: Example-based Image Synthesis

We present a simple nearest-neighbor (NN) approach that synthesizes high-frequency photorealistic images from an "incomplete" signal such as a low-resolution image, a surface normal map, or edges. Current state-of-the-art deep generative models designed for such conditional image synthesis lack two important things: (1) they are unable to generate a large set of diverse outputs, due to the mode collapse problem, and (2) they are not interpretable, making it difficult to control the synthesized output. We demonstrate that NN approaches potentially address such limitations, but suffer in accuracy on small datasets. We design a simple pipeline that combines the best of both worlds: the first stage uses a convolutional neural network (CNN) to map the input to an (overly smoothed) image, and the second stage uses a pixel-wise nearest-neighbor method to map the smoothed output to multiple high-quality, high-frequency outputs in a controllable manner. We demonstrate our approach for various input modalities, and for various domains ranging from human faces to cats-and-dogs to shoes and handbags.
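To make the two-stage pipeline concrete, below is a minimal NumPy sketch of the second, pixel-wise nearest-neighbor stage. It is illustrative only: the function name pixelwise_nn, the flattened per-pixel descriptor arrays, and the plain L2 matching are assumptions made for exposition, not the paper's exact method (which may use richer pixel descriptors and multi-scale matching).

```python
import numpy as np

def pixelwise_nn(smoothed, train_smooth, train_sharp):
    """Illustrative pixel-wise nearest-neighbor stage (not the paper's exact method).

    smoothed:     (H, W, C) over-smoothed CNN output for the query image.
    train_smooth: (N, C)    per-pixel descriptors pooled from the smoothed
                            versions of the training images.
    train_sharp:  (N, 3)    RGB values of the same training pixels in the
                            original high-frequency images.
    Returns an (H, W, 3) composite built by copying, for every query pixel,
    the training pixel whose descriptor is closest in squared L2 distance.
    """
    H, W, C = smoothed.shape
    queries = smoothed.reshape(-1, C)                    # (H*W, C)
    # Squared L2 distances between all query and training pixels,
    # expanded as ||q||^2 - 2 q.t + ||t||^2 to avoid an explicit loop.
    d2 = (
        (queries ** 2).sum(axis=1, keepdims=True)        # (H*W, 1)
        - 2.0 * queries @ train_smooth.T                 # (H*W, N)
        + (train_smooth ** 2).sum(axis=1)                # (N,)
    )
    idx = d2.argmin(axis=1)                              # nearest training pixel ids
    return train_sharp[idx].reshape(H, W, 3)

# Toy usage with random data (shapes only; real inputs would come from a
# trained CNN's smoothed predictions and a bank of training-image pixels).
rng = np.random.default_rng(0)
out = pixelwise_nn(rng.random((8, 8, 3)),
                   rng.random((500, 3)),
                   rng.random((500, 3)))
print(out.shape)  # (8, 8, 3)
```

Restricting train_smooth and train_sharp to pixels drawn from different training exemplars would yield different composites from the same smoothed input, which is one plausible way to realize the multiple, controllable outputs the abstract describes.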

Bibliographic Details
Published in: arXiv.org, 2017-08
Main Authors: Bansal, Aayush; Sheikh, Yaser; Ramanan, Deva
Format: Article
Language: English
EISSN: 2331-8422
Publisher: Ithaca: Cornell University Library, arXiv.org
Source: Publicly Available Content (ProQuest)
Subjects: Artificial neural networks; Bags; Collapse; Domains; Image resolution; Neural networks; Pipeline design; Stability; Synthesis