Recognition of Indoor Scenes Using 3-D Scene Graphs
Scene recognition is a fundamental task in 3-D scene understanding. It answers the question, "What is this place?" In an indoor environment, the answer can be an office, kitchen, lobby, and so on. As the number of point clouds increases, using embedded point information in scene recognition becomes computationally heavy to process. To achieve computational efficiency and accurate classification, our idea is to use an indoor scene graph that represents the 3-D spatial structures via object instances. The proposed method comprises two parts, namely: 1) construction of indoor scene graphs leveraging object instances and their spatial relationships and 2) classification of these graphs using a deep learning network. Specifically, each indoor scene is represented by a graph, where each node represents either a structural element (like a ceiling, a wall, or a floor) or a piece of furniture (like a chair or a table), and each edge encodes the spatial relationship between these elements. Then, these graphs are used as input for our proposed graph classification network to learn different scene representations. The public indoor dataset, ScanNet v2, with 625.53 million points, is selected to test our method. Experiments yield good results with up to 88.00% accuracy and 82.30% F1 score in the fixed validation dataset and 90.46% accuracy and 81.45% F1 score in the ten-fold cross-validation method; moreover, if some indoor objects cannot be successfully identified, the scene classification accuracy depends sublinearly on the rate of missing objects in the scene.
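The pipeline the abstract describes (object instances as nodes, spatial relationships as edges, the resulting graph fed to a classifier) can be illustrated with a minimal data structure. This is a hypothetical sketch, not the authors' implementation: the class name, relation labels, and toy scene below are invented for illustration only.

```python
# Sketch of an indoor scene graph: nodes are object-instance labels
# (structural elements or furniture), edges are (i, j, relation) triples
# encoding a spatial relationship between two instances.
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    scene_label: str                           # e.g. "office", "kitchen"
    nodes: list = field(default_factory=list)  # object-instance labels
    edges: list = field(default_factory=list)  # (i, j, relation) triples

    def add_object(self, label: str) -> int:
        """Add an object instance; return its node index."""
        self.nodes.append(label)
        return len(self.nodes) - 1

    def relate(self, i: int, j: int, relation: str) -> None:
        """Record a spatial relationship between two object instances."""
        self.edges.append((i, j, relation))

# Build a toy "office" scene with one structural element and two
# pieces of furniture; relation names here are purely illustrative.
g = SceneGraph("office")
floor = g.add_object("floor")
desk = g.add_object("desk")
chair = g.add_object("chair")
g.relate(desk, floor, "on")
g.relate(chair, floor, "on")
g.relate(chair, desk, "next_to")
```

In the paper's setting, such graphs (rather than raw point clouds) become the inputs to a graph classification network, which is what makes the approach cheaper than processing every embedded point.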
| Published in: | IEEE Transactions on Geoscience and Remote Sensing, 2024, Vol. 62, pp. 1-16 |
|---|---|
| Main Authors: | Yue, Han; Lehtola, Ville; Wu, Hangbin; Vosselman, George; Li, Jincheng; Liu, Chun |
| Format: | Article |
| Language: | English |
| Subjects: | indoor environments; point clouds; scene graphs; scene recognition; graph classification; deep learning |
| Online Access: | Get full text |
creator | Yue, Han; Lehtola, Ville; Wu, Hangbin; Vosselman, George; Li, Jincheng; Liu, Chun |
description | Scene recognition is a fundamental task in 3-D scene understanding. It answers the question, "What is this place?" In an indoor environment, the answer can be an office, kitchen, lobby, and so on. As the number of point clouds increases, using embedded point information in scene recognition becomes computationally heavy to process. To achieve computational efficiency and accurate classification, our idea is to use an indoor scene graph that represents the 3-D spatial structures via object instances. The proposed method comprises two parts, namely: 1) construction of indoor scene graphs leveraging object instances and their spatial relationships and 2) classification of these graphs using a deep learning network. Specifically, each indoor scene is represented by a graph, where each node represents either a structural element (like a ceiling, a wall, or a floor) or a piece of furniture (like a chair or a table), and each edge encodes the spatial relationship between these elements. Then, these graphs are used as input for our proposed graph classification network to learn different scene representations. The public indoor dataset, ScanNet v2, with 625.53 million points, is selected to test our method. Experiments yield good results with up to 88.00% accuracy and 82.30% F1 score in the fixed validation dataset and 90.46% accuracy and 81.45% F1 score in the ten-fold cross-validation method; moreover, if some indoor objects cannot be successfully identified, the scene classification accuracy depends sublinearly on the rate of missing objects in the scene. |
doi_str_mv | 10.1109/TGRS.2024.3387556 |
issn | 0196-2892 (print); 1558-0644 (electronic) |
source | IEEE Electronic Library (IEL) Journals |
subjects | Accuracy; Classification; Cloud computing; Convolution; Datasets; Deep learning; Feature extraction; Graph classification; Graphical representations; Graphs; indoor; Indoor environments; Instance segmentation; Point cloud compression; point clouds; Recognition; Scene analysis; scene graphs; scene recognition; Semantics; Structural members; Three-dimensional displays |
title | Recognition of Indoor Scenes Using 3-D Scene Graphs |