
Recognition of Indoor Scenes Using 3-D Scene Graphs

Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2024, Vol. 62, p. 1-16
Main Authors: Yue, Han, Lehtola, Ville, Wu, Hangbin, Vosselman, George, Li, Jincheng, Liu, Chun
Format: Article
Language:English
description Scene recognition is a fundamental task in 3-D scene understanding. It answers the question, "What is this place?" In an indoor environment, the answer can be an office, kitchen, lobby, and so on. As the number of point clouds increases, using embedded point information in scene recognition becomes computationally heavy to process. To achieve computational efficiency and accurate classification, our idea is to use an indoor scene graph that represents the 3-D spatial structures via object instances. The proposed method comprises two parts, namely: 1) construction of indoor scene graphs leveraging object instances and their spatial relationships and 2) classification of these graphs using a deep learning network. Specifically, each indoor scene is represented by a graph, where each node represents either a structural element (like a ceiling, a wall, or a floor) or a piece of furniture (like a chair or a table), and each edge encodes the spatial relationship between these elements. Then, these graphs are used as input for our proposed graph classification network to learn different scene representations. The public indoor dataset, ScanNet v2, with 625.53 million points, is selected to test our method. Experiments yield good results with up to 88.00% accuracy and 82.30% F1 score in the fixed validation dataset and 90.46% accuracy and 81.45% F1 score in the ten-fold cross-validation method; moreover, if some indoor objects cannot be successfully identified, the scene classification accuracy depends sublinearly on the rate of missing objects in the scene.
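The scene-graph representation summarized in the description above can be sketched as a minimal data structure. This is an illustrative assumption, not the authors' implementation: the class name `SceneGraph` and the relation labels are invented for the sketch; nodes hold object-instance labels (structural elements or furniture) and edges hold pairwise spatial relationships.

```python
# Illustrative sketch (assumed names, not the paper's code): an indoor
# scene graph whose nodes are object instances and whose edges carry
# spatial relationships between pairs of instances.
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)   # node id -> object label
    edges: list = field(default_factory=list)   # (id_a, id_b, relation)

    def add_object(self, node_id: int, label: str) -> None:
        self.nodes[node_id] = label

    def add_relation(self, a: int, b: int, relation: str) -> None:
        self.edges.append((a, b, relation))

# Toy scene: structural elements plus furniture, as in the description.
g = SceneGraph()
g.add_object(0, "floor")
g.add_object(1, "wall")
g.add_object(2, "table")
g.add_object(3, "chair")
g.add_relation(1, 0, "attached_to")
g.add_relation(2, 0, "standing_on")
g.add_relation(3, 2, "next_to")
print(len(g.nodes), len(g.edges))  # prints: 4 3
```

In the paper's pipeline, a graph like this (one per scene) would then be fed to a graph classification network to predict the scene category.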
doi 10.1109/TGRS.2024.3387556
format article
identifier ISSN: 0196-2892
ispartof IEEE transactions on geoscience and remote sensing, 2024, Vol.62, p.1-16
issn 0196-2892
1558-0644
language eng
source IEEE Electronic Library (IEL) Journals
subjects Accuracy
Classification
Cloud computing
Convolution
Datasets
Deep learning
Feature extraction
Graph classification
Graphical representations
Graphs
indoor
Indoor environments
Instance segmentation
Point cloud compression
point clouds
Recognition
Scene analysis
scene graphs
scene recognition
Semantics
Structural members
Three-dimensional displays
title Recognition of Indoor Scenes Using 3-D Scene Graphs