
Recognition of Indoor Scenes Using 3-D Scene Graphs

Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, 2024, Vol. 62, p. 1-16
Main Authors: Yue, Han, Lehtola, Ville, Wu, Hangbin, Vosselman, George, Li, Jincheng, Liu, Chun
Format: Article
Language:English
description Scene recognition is a fundamental task in 3-D scene understanding. It answers the question, "What is this place?" In an indoor environment, the answer can be an office, kitchen, lobby, and so on. As the number of point clouds increases, using embedded point information in scene recognition becomes computationally heavy to process. To achieve computational efficiency and accurate classification, our idea is to use an indoor scene graph that represents the 3-D spatial structures via object instances. The proposed method comprises two parts, namely: 1) construction of indoor scene graphs leveraging object instances and their spatial relationships and 2) classification of these graphs using a deep learning network. Specifically, each indoor scene is represented by a graph, where each node represents either a structural element (like a ceiling, a wall, or a floor) or a piece of furniture (like a chair or a table), and each edge encodes the spatial relationship between these elements. Then, these graphs are used as input for our proposed graph classification network to learn different scene representations. The public indoor dataset, ScanNet v2, with 625.53 million points, is selected to test our method. Experiments yield good results with up to 88.00% accuracy and 82.30% F1 score in the fixed validation dataset and 90.46% accuracy and 81.45% F1 score in the ten-fold cross-validation method; moreover, if some indoor objects cannot be successfully identified, the scene classification accuracy depends sublinearly on the rate of missing objects in the scene.
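The scene-graph representation summarized in the description above can be sketched as a minimal data structure. This is an illustrative assumption, not the authors' implementation: the class name `SceneGraph` and the relation labels are invented for the sketch; nodes hold object-instance labels (structural elements or furniture) and edges hold pairwise spatial relationships.

```python
# Illustrative sketch (assumed names, not the paper's code): an indoor
# scene graph whose nodes are object instances and whose edges carry
# spatial relationships between pairs of instances.
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    nodes: dict = field(default_factory=dict)   # node id -> object label
    edges: list = field(default_factory=list)   # (id_a, id_b, relation)

    def add_object(self, node_id: int, label: str) -> None:
        self.nodes[node_id] = label

    def add_relation(self, a: int, b: int, relation: str) -> None:
        self.edges.append((a, b, relation))

# Toy scene: structural elements plus furniture, as in the description.
g = SceneGraph()
g.add_object(0, "floor")
g.add_object(1, "wall")
g.add_object(2, "table")
g.add_object(3, "chair")
g.add_relation(1, 0, "attached_to")
g.add_relation(2, 0, "standing_on")
g.add_relation(3, 2, "next_to")
print(len(g.nodes), len(g.edges))  # prints: 4 3
```

In the paper's pipeline, a graph like this (one per scene) would then be fed to a graph classification network to predict the scene category.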
doi 10.1109/TGRS.2024.3387556
format article
identifier ISSN: 0196-2892
ispartof IEEE transactions on geoscience and remote sensing, 2024, Vol.62, p.1-16
issn 0196-2892
1558-0644
language eng
source IEEE Electronic Library (IEL) Journals
subjects Accuracy
Classification
Cloud computing
Convolution
Datasets
Deep learning
Feature extraction
Graph classification
Graphical representations
Graphs
indoor
Indoor environments
Instance segmentation
Point cloud compression
point clouds
Recognition
Scene analysis
scene graphs
scene recognition
Semantics
Structural members
Three-dimensional displays
title Recognition of Indoor Scenes Using 3-D Scene Graphs