PixelNN: Example-based Image Synthesis
We present a simple nearest-neighbor (NN) approach that synthesizes high-frequency photorealistic images from an "incomplete" signal such as a low-resolution image, a surface normal map, or edges. Current state-of-the-art deep generative models designed for such conditional image synthesis lack two important things: (1) they are unable to generate a large set of diverse outputs, due to the mode-collapse problem, and (2) they are not interpretable, making it difficult to control the synthesized output. We demonstrate that NN approaches potentially address such limitations, but suffer in accuracy on small datasets. We design a simple pipeline that combines the best of both worlds: the first stage uses a convolutional neural network (CNN) to map the input to an (overly smoothed) image, and the second stage uses a pixel-wise nearest-neighbor method to map the smoothed output to multiple high-quality, high-frequency outputs in a controllable manner. We demonstrate our approach for various input modalities, and for various domains ranging from human faces to cats-and-dogs to shoes and handbags.
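The two-stage pipeline described in the abstract lends itself to a compact illustration. Below is a minimal, hypothetical sketch of the second stage only (pixel-wise nearest neighbors), assuming the first-stage CNN has already produced a smoothed image. All function and variable names are invented for illustration, and raw RGB patch distances stand in for the learned features a full implementation would match on; this is not the authors' code.

```python
# Hypothetical sketch of a pixel-wise nearest-neighbor stage: for each pixel
# of a smoothed CNN output, find the best-matching patch in a training
# (smoothed -> high-frequency) image pair and copy over the matched pixel.
import numpy as np

def extract_patches(img, k=3):
    """Return an (H*W, k*k*C) array of k-by-k patches centered on every pixel."""
    H, W, C = img.shape
    pad = k // 2
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    patches = np.empty((H * W, k * k * C), dtype=img.dtype)
    idx = 0
    for i in range(H):
        for j in range(W):
            patches[idx] = padded[i:i + k, j:j + k, :].ravel()
            idx += 1
    return patches

def pixelwise_nn(smoothed, train_smoothed, train_target, k=3):
    """Replace each pixel of `smoothed` with the pixel of `train_target`
    whose surrounding patch in `train_smoothed` is closest (brute force)."""
    H, W, C = smoothed.shape
    queries = extract_patches(smoothed, k)       # (H*W, k*k*C)
    keys = extract_patches(train_smoothed, k)    # (N, k*k*C)
    values = train_target.reshape(-1, C)         # (N, C) high-frequency pixels
    out = np.empty((H * W, C), dtype=train_target.dtype)
    for n in range(queries.shape[0]):
        d = ((keys - queries[n]) ** 2).sum(axis=1)  # squared L2 over patches
        out[n] = values[int(d.argmin())]            # copy best-matching pixel
    return out.reshape(H, W, C)
```

Under these assumptions, selecting among the top-K matches, or restricting the search to different training exemplars, is what would make the output both diverse and controllable, as the abstract claims.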
Published in: | arXiv.org 2017-08 |
---|---|
Main Authors: | Bansal, Aayush; Sheikh, Yaser; Ramanan, Deva |
Format: | Article |
Language: | English |
Subjects: | Artificial neural networks; Bags; Collapse; Domains; Image resolution; Neural networks; Pipeline design; Stability; Synthesis |
Online Access: | Get full text |
container_title | arXiv.org |
---|---|
creator | Bansal, Aayush; Sheikh, Yaser; Ramanan, Deva |
description | We present a simple nearest-neighbor (NN) approach that synthesizes high-frequency photorealistic images from an "incomplete" signal such as a low-resolution image, a surface normal map, or edges. Current state-of-the-art deep generative models designed for such conditional image synthesis lack two important things: (1) they are unable to generate a large set of diverse outputs, due to the mode-collapse problem, and (2) they are not interpretable, making it difficult to control the synthesized output. We demonstrate that NN approaches potentially address such limitations, but suffer in accuracy on small datasets. We design a simple pipeline that combines the best of both worlds: the first stage uses a convolutional neural network (CNN) to map the input to an (overly smoothed) image, and the second stage uses a pixel-wise nearest-neighbor method to map the smoothed output to multiple high-quality, high-frequency outputs in a controllable manner. We demonstrate our approach for various input modalities, and for various domains ranging from human faces to cats-and-dogs to shoes and handbags. |
format | article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2017-08 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2075726531 |
source | Publicly Available Content (ProQuest) |
subjects | Artificial neural networks; Bags; Collapse; Domains; Image resolution; Neural networks; Pipeline design; Stability; Synthesis |
title | PixelNN: Example-based Image Synthesis |