Learning Intra-class Multimodal Distributions with Orthonormal Matrices

In this paper, we address the challenges of representing feature distributions which have multimodality within a class in deep neural networks. Existing online clustering methods employ sub-centroids to capture intra-class variations. However, conducting online clustering faces some limitations, i.e...

Bibliographic Details
Main Authors: Goto, Jumpei; Nakata, Yohei; Abe, Kiyofumi; Ishii, Yasunori; Yamashita, Takayoshi
Format: Conference Proceeding
Language: English
Subjects:
Online Access: Request full text
container_end_page 1868
container_start_page 1859
creator Goto, Jumpei
Nakata, Yohei
Abe, Kiyofumi
Ishii, Yasunori
Yamashita, Takayoshi
description In this paper, we address the challenges of representing feature distributions which have multimodality within a class in deep neural networks. Existing online clustering methods employ sub-centroids to capture intra-class variations. However, conducting online clustering faces some limitations, i.e., online clustering assigns only a single sub-centroid to a feature vector extracted from a backbone and ignores the relationship between the other sub-centroids and the feature vector, and updating sub-centroids in an online clustering manner incurs significant storage costs. To address these limitations, we propose a novel method utilizing orthonormal matrices instead of sub-centroids for relaxing discrete assignments into continuous assignments. We update the orthonormal matrices using a gradient-based method, which eliminates the need for online clustering or additional storage. Experimental results on the CIFAR and ImageNet datasets exhibit that the proposed method outperforms current online clustering techniques in classification accuracy, sub-category discovery, and transferability, providing an efficient solution to the challenges posed by complex recognition targets.
doi_str_mv 10.1109/WACV57701.2024.00188
format conference_proceeding
fulltext fulltext_linktorsrc
identifier EISSN: 2642-9381
ispartof 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024, p.1859-1868
issn 2642-9381
language eng
recordid cdi_ieee_primary_10484104
source IEEE Xplore All Conference Series
subjects accountable
Algorithms
and algorithms
Clustering methods
Computational modeling
Computer vision
Costs
ethical computer vision
Explainable
fair
formulations
Image recognition
Image recognition and understanding
Lead acid batteries
Machine learning architectures
privacy-preserving
Target recognition
title Learning Intra-class Multimodal Distributions with Orthonormal Matrices
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T16%3A38%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Learning%20Intra-class%20Multimodal%20Distributions%20with%20Orthonormal%20Matrices&rft.btitle=2024%20IEEE/CVF%20Winter%20Conference%20on%20Applications%20of%20Computer%20Vision%20(WACV)&rft.au=Goto,%20Jumpei&rft.date=2024-01-03&rft.spage=1859&rft.epage=1868&rft.pages=1859-1868&rft.eissn=2642-9381&rft.coden=IEEPAD&rft_id=info:doi/10.1109/WACV57701.2024.00188&rft.eisbn=9798350318920&rft_dat=%3Cieee_CHZPO%3E10484104%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i119t-7939c051c9d0dd77a38a438471ca4cce6580d9cdb093f6166349a7fa24c16a693%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10484104&rfr_iscdi=true