Loading…

Scene Parsing With Integration of Parametric and Non-Parametric Models

We adopt convolutional neural networks (CNNs) to be our parametric model to learn discriminative features and classifiers for local patch classification. Based on the occurrence frequency distribution of classes, an ensemble of CNNs (CNN-Ensemble) are learned, in which each CNN component focuses on...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on image processing 2016-05, Vol.25 (5), p.2379-2391
Main Authors: Shuai, Bing, Zuo, Zhen, Wang, Gang, Wang, Bing
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:We adopt convolutional neural networks (CNNs) to be our parametric model to learn discriminative features and classifiers for local patch classification. Based on the occurrence frequency distribution of classes, an ensemble of CNNs (CNN-Ensemble) are learned, in which each CNN component focuses on learning different and complementary visual patterns. The local beliefs of pixels are output by CNN-Ensemble. Considering that visually similar pixels are indistinguishable under local context, we leverage the global scene semantics to alleviate the local ambiguity. The global scene constraint is mathematically achieved by adding a global energy term to the labeling energy function, and it is practically estimated in a non-parametric framework. A large margin-based CNN metric learning method is also proposed for better global belief estimation. In the end, the integration of local and global beliefs gives rise to the class likelihood of pixels, based on which maximum marginal inference is performed to generate the label prediction maps. Even without any post-processing, we achieve the state-of-the-art results on the challenging SiftFlow and Barcelona benchmarks.
ISSN:1057-7149
1941-0042
DOI:10.1109/TIP.2016.2533862