Retro-Remote Sensing: Generating Images From Ancient Texts

Bibliographic Details
Published in:	IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2019-03, Vol. 12 (3), pp. 950-960
Main Authors:	Bejiga, Mesay Belete; Melgani, Farid; Vascotto, Antonio
Format:	Article
Language:	English
Subjects:	Artificial neural networks; Audio data; Convolutional neural networks (CNN); deep learning; Descriptions; Detection; Distribution; Earth; Equivalence; Gallium nitride; generative adversarial networks (GAN); Generators; Information processing; Learning; Mapping; multimodal learning; Neural networks; Remote sensing; Satellite imagery; Satellites; Sensors; Statistical methods; Synthesis; Technological innovation; text-to-image synthesis; Texts

Description
Summary:	The data available in the world come in various modalities, such as audio, text, image, and video. Each data modality has different statistical properties. Understanding each modality, individually, and the relationship between the modalities is vital for a better understanding of the environment surrounding us. Multimodal learning models allow us to process and extract useful information from multimodal sources. For instance, image captioning and text-to-image synthesis are examples of multimodal learning, which require mapping between texts and images. In this paper, we introduce a research area that has never been explored by the remote sensing community, namely the synthesis of remote sensing images from text descriptions. More specifically, in this paper, we focus on exploiting ancient text descriptions of geographical areas, inherited from previous civilizations, to generate equivalent remote sensing images. From a methodological perspective, we propose to rely on generative adversarial networks (GANs) to convert the text descriptions into equivalent pixel values. GANs are a recently proposed class of generative models that formulate learning the distribution of a given dataset as an adversarial competition between two networks. The learned distribution is represented using the weights of a deep neural network and can be used to generate more samples. To fulfill the purpose of this paper, we collected satellite images and ancient texts to train the network. We present the interesting results obtained and propose various future research paths that we believe are important to further develop this new research area.
ISSN:	1939-1404; 2151-1535
DOI:	10.1109/JSTARS.2019.2895693