The Bangkok Urbanscapes Dataset for Semantic Urban Scene Understanding Using Enhanced Encoder-Decoder With Atrous Depthwise Separable A1 Convolutional Neural Networks

Bibliographic Details
Published in:	IEEE access 2022, Vol.10, p.59327-59349
Main Authors:	Thitisiriwech, Kitsaphon; Panboonyuen, Teerapong; Kantavat, Pittipol; Iwahori, Yuji; Kijsirikul, Boonserm
Format:	Article
Language:	English
Subjects:	Architecture; Artificial neural networks; Autonomous cars; Coders; Computer architecture; Computer vision; Convolution; Datasets; Decoding; deep convolutional neural networks; Driving conditions; Encoders-Decoders; Feature extraction; Image classification; Image segmentation; Motorcycles; Object recognition; Scene analysis; semantic image segmentation; Semantic segmentation; Semantics; Task analysis; Thailand; Units of measurement; urbanscapes dataset

Description
Summary:	Semantic segmentation is a widely researched computer vision task and plays an essential role in real-world applications, including autonomous driving systems. To further the study of self-driving cars in Thailand, we contribute both a method and a dataset in this paper. The method, DeepLab-V3-A1 with Xception, extends the DeepLab-V3+ architecture by using a different number of 1×1 convolution layers on the decoder side and by refining the image classification backbone with a modified Xception model. Experiments were conducted on four datasets: the proposed dataset and three public datasets (CamVid, Cityscapes, and IDD). The results show that DeepLab-V3-A1 with Xception performs comparably to the baseline methods on all four datasets in terms of mean IoU, F1 score, precision, and recall. In addition, DeepLab-V3-A1 with Xception achieves a mean IoU of 78.86% on the validation set of the Cityscapes dataset. As the dataset, we contribute Bangkok Urbanscapes, a collection of urban scenes in Southeast Asia that contains 701 pairs of input images and annotated labels. It covers various driving environments in Bangkok, labeled with eleven semantic classes (Road, Building, Tree, Car, Footpath, Motorcycle, Pole, Person, Trash, Crosswalk, and Misc). We hope that our architecture and dataset will help developers of self-driving systems improve driving in cities with unique traffic and driving conditions similar to those of Bangkok and elsewhere in Thailand. Our implementation code and dataset are available at https://kaopanboonyuen.github.io/bkkurbanscapes.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2022.3176712
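
Note on the reported metrics: the summary evaluates models by mean IoU, F1 score, precision, and recall. As a minimal sketch of how such per-class scores are commonly derived from a pixel-level confusion matrix, the standard-library Python below may help. The function name `segmentation_metrics`, the macro (unweighted per-class) averaging, and the choice to score an empty class as 0 are assumptions made for illustration; the record does not state which conventions the authors used.

```python
def segmentation_metrics(confusion):
    """Per-class and macro-averaged IoU, precision, recall and F1.

    confusion[i][j] is the number of pixels whose true class is i and
    whose predicted class is j (hypothetical input layout).
    """
    n = len(confusion)
    per_class = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true other

        # Assumption: a class with an empty denominator is scored as 0.
        iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        per_class.append(
            {"iou": iou, "precision": precision, "recall": recall, "f1": f1}
        )

    # Macro average: every class counts equally, regardless of pixel count.
    mean = {
        key: sum(scores[key] for scores in per_class) / n
        for key in ("iou", "precision", "recall", "f1")
    }
    return per_class, mean


if __name__ == "__main__":
    # Toy two-class example, not data from the paper.
    toy = [[3, 1],
           [1, 5]]
    classes, averages = segmentation_metrics(toy)
    print(classes)
    print(averages)
```

With eleven classes, as in Bangkok Urbanscapes, the same function would take an 11×11 matrix. Macro averaging gives rare classes such as Crosswalk the same weight as Road, which is one reason mean IoU is often preferred over plain pixel accuracy for street scenes.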