Loading…
Scalable Robust Principal Component Analysis Using Grassmann Averages
In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunat...
Saved in:
Published in: | IEEE transactions on pattern analysis and machine intelligence 2016-11, Vol.38 (11), p.2298-2311 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
cited_by | cdi_FETCH-LOGICAL-c417t-fbd1ebf62dd0c3746d2101da273cf1eac94ab9327a1f10a41011ea9d7b81f22a3 |
---|---|
cites | cdi_FETCH-LOGICAL-c417t-fbd1ebf62dd0c3746d2101da273cf1eac94ab9327a1f10a41011ea9d7b81f22a3 |
container_end_page | 2311 |
container_issue | 11 |
container_start_page | 2298 |
container_title | IEEE transactions on pattern analysis and machine intelligence |
container_volume | 38 |
creator | Hauberg, Sren Feragen, Aasa Enficiaud, Raffi Black, Michael J. |
description | In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online. |
doi_str_mv | 10.1109/TPAMI.2015.2511743 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TPAMI_2015_2511743</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7364267</ieee_id><sourcerecordid>4223939651</sourcerecordid><originalsourceid>FETCH-LOGICAL-c417t-fbd1ebf62dd0c3746d2101da273cf1eac94ab9327a1f10a41011ea9d7b81f22a3</originalsourceid><addsrcrecordid>eNpdkNFKwzAUhoMobk5fQEEK3njTmZOkaXNZxpwDxaHbdUjbdHS0aU1aYW9v6-YuvDpwzvf_cD6EbgFPAbB4Wq_it-WUYAimJAAIGT1DYxBU-DSg4hyNMXDiRxGJRujKuR3GwAJML9GI8JACp2yM5p-pKlVSau-jTjrXeitbmLRoVOnN6qqpjTatFxtV7l3hvI0rzNZbWOVcpYzx4m9t1Va7a3SRq9Lpm-OcoM3zfD178V_fF8tZ_OqnDMLWz5MMdJJzkmU4pSHjGQEMmSIhTXPQKhVMJYKSUEEOWLH-2G9FFiYR5IQoOkGPh97G1l-ddq2sCpfqslRG152TEBHOAywC1qMP_9Bd3dn-kYGigHEAAnqKHKjU1s5ZncvGFpWyewlYDpLlr2Q5SJZHyX3o_ljdJZXOTpE_qz1wdwAKrfXpHFLOBuQHqhh_xQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1831005191</pqid></control><display><type>article</type><title>Scalable Robust Principal Component Analysis Using Grassmann Averages</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Hauberg, Sren ; Feragen, Aasa ; Enficiaud, Raffi ; Black, Michael J.</creator><creatorcontrib>Hauberg, Sren ; Feragen, Aasa ; Enficiaud, Raffi ; Black, Michael J.</creatorcontrib><description>In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.</description><identifier>ISSN: 0162-8828</identifier><identifier>EISSN: 1939-3539</identifier><identifier>EISSN: 2160-9292</identifier><identifier>DOI: 10.1109/TPAMI.2015.2511743</identifier><identifier>PMID: 26731634</identifier><identifier>CODEN: ITPIDJ</identifier><language>eng</language><publisher>United States: IEEE</publisher><subject>Approximation methods ; Complexity theory ; Computer vision ; Dimensionality reduction ; Estimation ; Manifolds ; Principal component analysis ; Principal components analysis ; robust principal component analysis ; Robustness ; subspace estimation</subject><ispartof>IEEE transactions on pattern analysis and machine intelligence, 2016-11, Vol.38 (11), p.2298-2311</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2016</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c417t-fbd1ebf62dd0c3746d2101da273cf1eac94ab9327a1f10a41011ea9d7b81f22a3</citedby><cites>FETCH-LOGICAL-c417t-fbd1ebf62dd0c3746d2101da273cf1eac94ab9327a1f10a41011ea9d7b81f22a3</cites><orcidid>0000-0001-7223-877X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7364267$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/26731634$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Hauberg, Sren</creatorcontrib><creatorcontrib>Feragen, Aasa</creatorcontrib><creatorcontrib>Enficiaud, Raffi</creatorcontrib><creatorcontrib>Black, Michael J.</creatorcontrib><title>Scalable Robust Principal Component Analysis Using Grassmann Averages</title><title>IEEE transactions on pattern analysis and machine intelligence</title><addtitle>TPAMI</addtitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><description>In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.</description><subject>Approximation methods</subject><subject>Complexity theory</subject><subject>Computer vision</subject><subject>Dimensionality reduction</subject><subject>Estimation</subject><subject>Manifolds</subject><subject>Principal component analysis</subject><subject>Principal components analysis</subject><subject>robust principal component analysis</subject><subject>Robustness</subject><subject>subspace estimation</subject><issn>0162-8828</issn><issn>1939-3539</issn><issn>2160-9292</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><recordid>eNpdkNFKwzAUhoMobk5fQEEK3njTmZOkaXNZxpwDxaHbdUjbdHS0aU1aYW9v6-YuvDpwzvf_cD6EbgFPAbB4Wq_it-WUYAimJAAIGT1DYxBU-DSg4hyNMXDiRxGJRujKuR3GwAJML9GI8JACp2yM5p-pKlVSau-jTjrXeitbmLRoVOnN6qqpjTatFxtV7l3hvI0rzNZbWOVcpYzx4m9t1Va7a3SRq9Lpm-OcoM3zfD178V_fF8tZ_OqnDMLWz5MMdJJzkmU4pSHjGQEMmSIhTXPQKhVMJYKSUEEOWLH-2G9FFiYR5IQoOkGPh97G1l-ddq2sCpfqslRG152TEBHOAywC1qMP_9Bd3dn-kYGigHEAAnqKHKjU1s5ZncvGFpWyewlYDpLlr2Q5SJZHyX3o_ljdJZXOTpE_qz1wdwAKrfXpHFLOBuQHqhh_xQ</recordid><startdate>20161101</startdate><enddate>20161101</enddate><creator>Hauberg, Sren</creator><creator>Feragen, Aasa</creator><creator>Enficiaud, Raffi</creator><creator>Black, Michael J.</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0001-7223-877X</orcidid></search><sort><creationdate>20161101</creationdate><title>Scalable Robust Principal Component Analysis Using Grassmann Averages</title><author>Hauberg, Sren ; Feragen, Aasa ; Enficiaud, Raffi ; Black, Michael J.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c417t-fbd1ebf62dd0c3746d2101da273cf1eac94ab9327a1f10a41011ea9d7b81f22a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Approximation methods</topic><topic>Complexity theory</topic><topic>Computer vision</topic><topic>Dimensionality reduction</topic><topic>Estimation</topic><topic>Manifolds</topic><topic>Principal component analysis</topic><topic>Principal components analysis</topic><topic>robust principal component analysis</topic><topic>Robustness</topic><topic>subspace estimation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hauberg, Sren</creatorcontrib><creatorcontrib>Feragen, Aasa</creatorcontrib><creatorcontrib>Enficiaud, Raffi</creatorcontrib><creatorcontrib>Black, Michael J.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE Xplore Open Access Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hauberg, Sren</au><au>Feragen, Aasa</au><au>Enficiaud, Raffi</au><au>Black, Michael J.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Scalable Robust Principal Component Analysis Using Grassmann Averages</atitle><jtitle>IEEE transactions on pattern analysis and machine intelligence</jtitle><stitle>TPAMI</stitle><addtitle>IEEE Trans Pattern Anal Mach Intell</addtitle><date>2016-11-01</date><risdate>2016</risdate><volume>38</volume><issue>11</issue><spage>2298</spage><epage>2311</epage><pages>2298-2311</pages><issn>0162-8828</issn><eissn>1939-3539</eissn><eissn>2160-9292</eissn><coden>ITPIDJ</coden><abstract>In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.</abstract><cop>United States</cop><pub>IEEE</pub><pmid>26731634</pmid><doi>10.1109/TPAMI.2015.2511743</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0001-7223-877X</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0162-8828 |
ispartof | IEEE transactions on pattern analysis and machine intelligence, 2016-11, Vol.38 (11), p.2298-2311 |
issn | 0162-8828 1939-3539 2160-9292 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TPAMI_2015_2511743 |
source | IEEE Electronic Library (IEL) Journals |
subjects | Approximation methods Complexity theory Computer vision Dimensionality reduction Estimation Manifolds Principal component analysis Principal components analysis robust principal component analysis Robustness subspace estimation |
title | Scalable Robust Principal Component Analysis Using Grassmann Averages |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T16%3A22%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Scalable%20Robust%20Principal%20Component%20Analysis%20Using%20Grassmann%20Averages&rft.jtitle=IEEE%20transactions%20on%20pattern%20analysis%20and%20machine%20intelligence&rft.au=Hauberg,%20Sren&rft.date=2016-11-01&rft.volume=38&rft.issue=11&rft.spage=2298&rft.epage=2311&rft.pages=2298-2311&rft.issn=0162-8828&rft.eissn=1939-3539&rft.coden=ITPIDJ&rft_id=info:doi/10.1109/TPAMI.2015.2511743&rft_dat=%3Cproquest_cross%3E4223939651%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c417t-fbd1ebf62dd0c3746d2101da273cf1eac94ab9327a1f10a41011ea9d7b81f22a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1831005191&rft_id=info:pmid/26731634&rft_ieee_id=7364267&rfr_iscdi=true |