Loading…

DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision

We have witnessed significant progress in deep learning-based 3D vision, ranging from neural radiance field (NeRF) based 3D representation learning to applications in novel view synthesis (NVS). However, existing scene-level datasets for deep learning-based 3D vision, limited to either synthetic env...

Full description

Saved in:
Bibliographic Details
Published in:arXiv.org 2023-12
Main Authors: Lu, Ling, Sheng, Yichen, Tu, Zhi, Zhao, Wentian, Cheng, Xin, Wan, Kun, Yu, Lantao, Guo, Qianyu, Yu, Zixun, Lu, Yawen, Li, Xuanmao, Sun, Xingpeng, Rohan, Ashok, Mukherjee, Aniruddha, Kang, Hao, Kong, Xiangrui, Hua, Gang, Zhang, Tianyi, Benes, Bedrich, Bera, Aniket
Format: Article
Language:English
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:We have witnessed significant progress in deep learning-based 3D vision, ranging from neural radiance field (NeRF) based 3D representation learning to applications in novel view synthesis (NVS). However, existing scene-level datasets for deep learning-based 3D vision, limited to either synthetic environments or a narrow selection of real-world scenes, are quite insufficient. This insufficiency not only hinders a comprehensive benchmark of existing methods but also caps what could be explored in deep learning-based 3D analysis. To address this critical gap, we present DL3DV-10K, a large-scale scene dataset, featuring 51.2 million frames from 10,510 videos captured from 65 types of point-of-interest (POI) locations, covering both bounded and unbounded scenes, with different levels of reflection, transparency, and lighting. We conducted a comprehensive benchmark of recent NVS methods on DL3DV-10K, which revealed valuable insights for future research in NVS. In addition, we have obtained encouraging results in a pilot study to learn generalizable NeRF from DL3DV-10K, which manifests the necessity of a large-scale scene-level dataset to forge a path toward a foundation model for learning 3D representation. Our DL3DV-10K dataset, benchmark results, and models will be publicly accessible at https://dl3dv-10k.github.io/DL3DV-10K/.
ISSN:2331-8422