Single Camera Face Position-Invariant Driver’s Gaze Zone Classifier Based on Frame-Sequence Recognition Using 3D Convolutional Neural Networks
Estimating the driver’s gaze in a natural real-world setting can be problematic for different challenging scenario conditions. For example, faces will undergo facial occlusions, illumination, or various face positions while driving. In this effort, we aim to reduce misclassifications in driving situ...
Published in: | Sensors (Basel, Switzerland), 2022-08, Vol.22 (15), p.5857 |
---|---|
Main Authors: | Lollett, Catherine; Kamezaki, Mitsuhiro; Sugano, Shigeki |
Format: | Article |
Language: | English |
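Subjects: | Cameras; Classification; convolutional neural networks; Datasets; Driver behavior; driver monitoring; Evaluation; gaze classification; Support vector machines; Two dimensional models |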
Citations: | Items that this one cites |
Online Access: | Get full text |
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c472t-1940399285ca24bf13e9d0d4ca8ec704f3354c7696e11f355db9dea80f94c7943 |
container_end_page | |
container_issue | 15 |
container_start_page | 5857 |
container_title | Sensors (Basel, Switzerland) |
container_volume | 22 |
creator | Lollett, Catherine ; Kamezaki, Mitsuhiro ; Sugano, Shigeki |
description | Estimating a driver’s gaze in a natural, real-world setting is difficult under challenging conditions. For example, faces are subject to occlusion, changes in illumination, and varying positions while driving. In this work, we aim to reduce misclassifications in driving situations where the driver’s face is at different distances from the camera. Three-dimensional Convolutional Neural Network (3D CNN) models can build a spatio-temporal representation of the driver by extracting features encoded in multiple adjacent frames, which can describe motion. This characteristic may help offset the deficiencies of a per-frame recognition system, which lacks context information. The front, navigator, right window, left window, back mirror, and speed meter are among the areas drivers are known to check most often. Based on this, we implement and evaluate a model that detects the head direction toward these regions at various distances from the camera. In our evaluation, the 2D CNN model had a mean average recall of 74.96% across the three models, whereas the 3D CNN model had a mean average recall of 87.02%. These results show that our proposed 3D CNN-based approach outperforms a 2D CNN per-frame recognition approach in driving situations where the driver’s face is at different distances from the camera. |
doi_str_mv | 10.3390/s22155857 |
format | article |
fullrecord | <record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_e56104476d304bf0bdafd407733f0d7f</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_e56104476d304bf0bdafd407733f0d7f</doaj_id><sourcerecordid>2700764980</sourcerecordid><originalsourceid>FETCH-LOGICAL-c472t-1940399285ca24bf13e9d0d4ca8ec704f3354c7696e11f355db9dea80f94c7943</originalsourceid><addsrcrecordid>eNpdks9uEzEQxlcIREvhwBtY4gKHBf_btX1BoikpkSpAlF64WI49Dg4bu9i7QXDiEbjyejwJblJVlNOMPn_zm_FomuYxwc8ZU_hFoZR0nezEneaQcMpbSSm--09-0DwoZY0xZYzJ-80B61QnOKGHza_zEFcDoJnZQDZobiyg96mEMaTYLuLW5GDiiE5y2EL-8_N3QafmB6BPKdaawZQSfICMjk0Bh1JE81xB7Tl8nSBW1AewaRV3NHRRaivETtAsxW0apivRDOgtTHkXxm8pfykPm3veDAUeXcej5mL--uPsTXv27nQxe3XWWi7o2BLFMVOKys4aypeeMFAOO26NBCsw94x13Ipe9UCIZ13nlsqBkdirKivOjprFnuuSWevLHDYmf9fJBL0TUl5pk8dgB9DQ9QRzLnrHcG2Fl854x7EQjHnshK-sl3vW5bTcgLMQx_qlW9DbLzF81qu01YoJLHtaAU-vATnVzZVRb0KxMAwmQpqKpgJTIrlgslqf_GddpynXRe5cWPRcSVxdz_Yum1MpGfzNMATrq5vRNzfD_gJRq7P1</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2700764980</pqid></control><display><type>article</type><title>Single Camera Face Position-Invariant Driver’s Gaze Zone Classifier Based on Frame-Sequence Recognition Using 3D Convolutional Neural Networks</title><source>NCBI_PubMed Central(免费)</source><source>Publicly Available Content Database</source><creator>Lollett, Catherine ; Kamezaki, Mitsuhiro ; Sugano, Shigeki</creator><creatorcontrib>Lollett, Catherine ; Kamezaki, Mitsuhiro ; Sugano, Shigeki</creatorcontrib><description>Estimating the driver’s gaze in a natural real-world setting can be problematic for different challenging scenario conditions. For example, faces will undergo facial occlusions, illumination, or various face positions while driving. 
In this effort, we aim to reduce misclassifications in driving situations when the driver has different face distances regarding the camera. Three-dimensional Convolutional Neural Networks (CNN) models can make a spatio-temporal driver’s representation that extracts features encoded in multiple adjacent frames that can describe motions. This characteristic may help ease the deficiencies of a per-frame recognition system due to the lack of context information. For example, the front, navigator, right window, left window, back mirror, and speed meter are part of the known common areas to be checked by drivers. Based on this, we implement and evaluate a model that is able to detect the head direction toward these regions having various distances from the camera. In our evaluation, the 2D CNN model had a mean average recall of 74.96% across the three models, whereas the 3D CNN model had a mean average recall of 87.02%. This result show that our proposed 3D CNN-based approach outperforms a 2D CNN per-frame recognition approach in driving situations when the driver’s face has different distances from the camera.</description><identifier>ISSN: 1424-8220</identifier><identifier>EISSN: 1424-8220</identifier><identifier>DOI: 10.3390/s22155857</identifier><identifier>PMID: 35957412</identifier><language>eng</language><publisher>Basel: MDPI AG</publisher><subject>Cameras ; Classification ; convolutional neural networks ; Datasets ; Driver behavior ; driver monitoring ; Evaluation ; gaze classification ; Support vector machines ; Two dimensional models</subject><ispartof>Sensors (Basel, Switzerland), 2022-08, Vol.22 (15), p.5857</ispartof><rights>2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>2022 by the authors. 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c472t-1940399285ca24bf13e9d0d4ca8ec704f3354c7696e11f355db9dea80f94c7943</cites><orcidid>0000-0001-7478-423X ; 0000-0002-9331-2446</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/2700764980/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2700764980?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,25751,27922,27923,37010,37011,44588,53789,53791,74896</link.rule.ids></links><search><creatorcontrib>Lollett, Catherine</creatorcontrib><creatorcontrib>Kamezaki, Mitsuhiro</creatorcontrib><creatorcontrib>Sugano, Shigeki</creatorcontrib><title>Single Camera Face Position-Invariant Driver’s Gaze Zone Classifier Based on Frame-Sequence Recognition Using 3D Convolutional Neural Networks</title><title>Sensors (Basel, Switzerland)</title><description>Estimating the driver’s gaze in a natural real-world setting can be problematic for different challenging scenario conditions. For example, faces will undergo facial occlusions, illumination, or various face positions while driving. In this effort, we aim to reduce misclassifications in driving situations when the driver has different face distances regarding the camera. Three-dimensional Convolutional Neural Networks (CNN) models can make a spatio-temporal driver’s representation that extracts features encoded in multiple adjacent frames that can describe motions. 
This characteristic may help ease the deficiencies of a per-frame recognition system due to the lack of context information. For example, the front, navigator, right window, left window, back mirror, and speed meter are part of the known common areas to be checked by drivers. Based on this, we implement and evaluate a model that is able to detect the head direction toward these regions having various distances from the camera. In our evaluation, the 2D CNN model had a mean average recall of 74.96% across the three models, whereas the 3D CNN model had a mean average recall of 87.02%. This result show that our proposed 3D CNN-based approach outperforms a 2D CNN per-frame recognition approach in driving situations when the driver’s face has different distances from the camera.</description><subject>Cameras</subject><subject>Classification</subject><subject>convolutional neural networks</subject><subject>Datasets</subject><subject>Driver behavior</subject><subject>driver monitoring</subject><subject>Evaluation</subject><subject>gaze classification</subject><subject>Support vector machines</subject><subject>Two dimensional 
models</subject><issn>1424-8220</issn><issn>1424-8220</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNpdks9uEzEQxlcIREvhwBtY4gKHBf_btX1BoikpkSpAlF64WI49Dg4bu9i7QXDiEbjyejwJblJVlNOMPn_zm_FomuYxwc8ZU_hFoZR0nezEneaQcMpbSSm--09-0DwoZY0xZYzJ-80B61QnOKGHza_zEFcDoJnZQDZobiyg96mEMaTYLuLW5GDiiE5y2EL-8_N3QafmB6BPKdaawZQSfICMjk0Bh1JE81xB7Tl8nSBW1AewaRV3NHRRaivETtAsxW0apivRDOgtTHkXxm8pfykPm3veDAUeXcej5mL--uPsTXv27nQxe3XWWi7o2BLFMVOKys4aypeeMFAOO26NBCsw94x13Ipe9UCIZ13nlsqBkdirKivOjprFnuuSWevLHDYmf9fJBL0TUl5pk8dgB9DQ9QRzLnrHcG2Fl854x7EQjHnshK-sl3vW5bTcgLMQx_qlW9DbLzF81qu01YoJLHtaAU-vATnVzZVRb0KxMAwmQpqKpgJTIrlgslqf_GddpynXRe5cWPRcSVxdz_Yum1MpGfzNMATrq5vRNzfD_gJRq7P1</recordid><startdate>20220805</startdate><enddate>20220805</enddate><creator>Lollett, Catherine</creator><creator>Kamezaki, Mitsuhiro</creator><creator>Sugano, Shigeki</creator><general>MDPI AG</general><general>MDPI</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>K9.</scope><scope>M0S</scope><scope>M1P</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0001-7478-423X</orcidid><orcidid>https://orcid.org/0000-0002-9331-2446</orcidid></search><sort><creationdate>20220805</creationdate><title>Single Camera Face Position-Invariant Driver’s Gaze Zone Classifier Based on Frame-Sequence Recognition Using 3D Convolutional Neural Networks</title><author>Lollett, Catherine ; Kamezaki, Mitsuhiro ; Sugano, 
Shigeki</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c472t-1940399285ca24bf13e9d0d4ca8ec704f3354c7696e11f355db9dea80f94c7943</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Cameras</topic><topic>Classification</topic><topic>convolutional neural networks</topic><topic>Datasets</topic><topic>Driver behavior</topic><topic>driver monitoring</topic><topic>Evaluation</topic><topic>gaze classification</topic><topic>Support vector machines</topic><topic>Two dimensional models</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lollett, Catherine</creatorcontrib><creatorcontrib>Kamezaki, Mitsuhiro</creatorcontrib><creatorcontrib>Sugano, Shigeki</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>ProQuest_Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT 
USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>Sensors (Basel, Switzerland)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lollett, Catherine</au><au>Kamezaki, Mitsuhiro</au><au>Sugano, Shigeki</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Single Camera Face Position-Invariant Driver’s Gaze Zone Classifier Based on Frame-Sequence Recognition Using 3D Convolutional Neural Networks</atitle><jtitle>Sensors (Basel, Switzerland)</jtitle><date>2022-08-05</date><risdate>2022</risdate><volume>22</volume><issue>15</issue><spage>5857</spage><pages>5857-</pages><issn>1424-8220</issn><eissn>1424-8220</eissn><abstract>Estimating the driver’s gaze in a natural real-world setting can be problematic for different challenging scenario conditions. For example, faces will undergo facial occlusions, illumination, or various face positions while driving. In this effort, we aim to reduce misclassifications in driving situations when the driver has different face distances regarding the camera. Three-dimensional Convolutional Neural Networks (CNN) models can make a spatio-temporal driver’s representation that extracts features encoded in multiple adjacent frames that can describe motions. This characteristic may help ease the deficiencies of a per-frame recognition system due to the lack of context information. For example, the front, navigator, right window, left window, back mirror, and speed meter are part of the known common areas to be checked by drivers. 
Based on this, we implement and evaluate a model that is able to detect the head direction toward these regions having various distances from the camera. In our evaluation, the 2D CNN model had a mean average recall of 74.96% across the three models, whereas the 3D CNN model had a mean average recall of 87.02%. This result show that our proposed 3D CNN-based approach outperforms a 2D CNN per-frame recognition approach in driving situations when the driver’s face has different distances from the camera.</abstract><cop>Basel</cop><pub>MDPI AG</pub><pmid>35957412</pmid><doi>10.3390/s22155857</doi><orcidid>https://orcid.org/0000-0001-7478-423X</orcidid><orcidid>https://orcid.org/0000-0002-9331-2446</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1424-8220 |
ispartof | Sensors (Basel, Switzerland), 2022-08, Vol.22 (15), p.5857 |
issn | 1424-8220 1424-8220 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_e56104476d304bf0bdafd407733f0d7f |
source | NCBI_PubMed Central (free); Publicly Available Content Database |
subjects | Cameras ; Classification ; convolutional neural networks ; Datasets ; Driver behavior ; driver monitoring ; Evaluation ; gaze classification ; Support vector machines ; Two dimensional models |
title | Single Camera Face Position-Invariant Driver’s Gaze Zone Classifier Based on Frame-Sequence Recognition Using 3D Convolutional Neural Networks |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T12%3A07%3A32IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Single%20Camera%20Face%20Position-Invariant%20Driver%E2%80%99s%20Gaze%20Zone%20Classifier%20Based%20on%20Frame-Sequence%20Recognition%20Using%203D%20Convolutional%20Neural%20Networks&rft.jtitle=Sensors%20(Basel,%20Switzerland)&rft.au=Lollett,%20Catherine&rft.date=2022-08-05&rft.volume=22&rft.issue=15&rft.spage=5857&rft.pages=5857-&rft.issn=1424-8220&rft.eissn=1424-8220&rft_id=info:doi/10.3390/s22155857&rft_dat=%3Cproquest_doaj_%3E2700764980%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c472t-1940399285ca24bf13e9d0d4ca8ec704f3354c7696e11f355db9dea80f94c7943%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2700764980&rft_id=info:pmid/35957412&rfr_iscdi=true |