
A comparison of different procedures for principal component analysis in the presence of outliers

Principal component analysis (PCA) is a popular technique for dimensionality reduction, but it is sensitive to outliers. This sensitivity of classical PCA (CPCA) has motivated the development of new approaches. Effects of using estimates obtained by expectation-maxim...

Bibliographic Details
Published in: Journal of applied statistics 2015-08, Vol.42 (8), p.1716-1722
Main Authors: Alkan, B. Bariş, Atakan, Cemal, Alkan, Nesrin
Format: Article
Language: English
container_end_page 1722
container_issue 8
container_start_page 1716
container_title Journal of applied statistics
container_volume 42
creator Alkan, B. Bariş
Atakan, Cemal
Alkan, Nesrin
description Principal component analysis (PCA) is a popular technique for dimensionality reduction, but it is sensitive to outliers. This sensitivity of classical PCA (CPCA) has motivated the development of new approaches. The effects of replacing outliers with estimates obtained by expectation-maximization (EM) and multiple imputation (MI) were examined on an artificial data set and a real data set. Furthermore, robust PCA based on the minimum covariance determinant (MCD), PCA based on EM estimates substituted for outliers, and PCA based on MI estimates substituted for outliers were compared with the results of CPCA. The study aims to show how the effects of substituting MI and EM estimates for outliers depend on the proportion of outliers in the data set. Finally, when the proportion of outliers exceeds 20%, we suggest substituting MI and EM estimates for outliers as an alternative approach.
doi_str_mv 10.1080/02664763.2015.1005063
format article
publisher Abingdon: Taylor & Francis
rights 2015 Taylor & Francis; Copyright Taylor & Francis Ltd. 2015
date 2015-08-03
tpages 7
toplevel peer_reviewed; online_resources
identifier ISSN: 0266-4763
ispartof Journal of applied statistics, 2015-08, Vol.42 (8), p.1716-1722
issn 0266-4763
1360-0532
language eng
recordid cdi_proquest_journals_1687115365
source Business Source Ultimate; Taylor and Francis Science and Technology Collection
subjects Applied statistics
Covariance
Data analysis
Data collection
Determinants
Estimates
Estimating techniques
expectation-maximization
missing value
multiple imputation
outliers
Outliers (statistics)
Principal component analysis
Principal components analysis
Reduction
Statistical methods
Studies
title A comparison of different procedures for principal component analysis in the presence of outliers
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T22%3A51%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_infor&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20comparison%20of%20different%20procedures%20for%20principal%20component%20analysis%20in%20the%20presence%20of%20outliers&rft.jtitle=Journal%20of%20applied%20statistics&rft.au=Alkan,%20B.%20Bari%C5%9F&rft.date=2015-08-03&rft.volume=42&rft.issue=8&rft.spage=1716&rft.epage=1722&rft.pages=1716-1722&rft.issn=0266-4763&rft.eissn=1360-0532&rft_id=info:doi/10.1080/02664763.2015.1005063&rft_dat=%3Cproquest_infor%3E3710323831%3C/proquest_infor%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c441t-ff59a6577d817c789337693556352c352a5596d1b8ecf3a4c87ec90c429a866e3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1687115365&rft_id=info:pmid/&rfr_iscdi=true
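
The idea summarized in the description, that classical PCA is distorted by outliers and that replacing outliers with imputed estimates can reduce the distortion, can be illustrated with a minimal sketch. This is not the authors' procedure: the data are hypothetical, the MAD-based flagging rule and its 3.5 cutoff are assumptions, and a single regression imputation is used as a simplified stand-in for the EM and MI estimates the article studies (MCD-based robust PCA is not shown). All function names are illustrative.

```python
import math
from statistics import mean, median


def pc1_angle_deg(points):
    """Angle in degrees, within (-90, 90], of the first principal axis of 2-D points."""
    n = len(points)
    mx = mean([x for x, _ in points])
    my = mean([y for _, y in points])
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Closed-form orientation of the leading eigenvector of a 2x2 covariance matrix.
    return math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy))


def flag_outliers(values, cutoff=3.5):
    """Flag values whose MAD-based robust z-score exceeds the cutoff (assumed rule)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > cutoff for v in values]


def impute_flagged(points, flags):
    """Replace flagged y-values with least-squares predictions from unflagged points.

    Single regression imputation: a simplified stand-in for EM or MI estimates.
    """
    clean = [p for p, f in zip(points, flags) if not f]
    mx = mean([x for x, _ in clean])
    my = mean([y for _, y in clean])
    slope = (sum((x - mx) * (y - my) for x, y in clean)
             / sum((x - mx) ** 2 for x, _ in clean))
    intercept = my - slope * mx
    return [(x, intercept + slope * x) if f else (x, y)
            for (x, y), f in zip(points, flags)]


# Hypothetical data: y tracks x with small deterministic noise,
# then three y-values are corrupted to act as outliers.
clean_data = [(float(i), i + ((i * 7) % 5 - 2) * 0.2) for i in range(20)]
contaminated = [(x, 60.0) if i in (0, 2, 4) else (x, y)
                for i, (x, y) in enumerate(clean_data)]

flags = flag_outliers([y for _, y in contaminated])
repaired = impute_flagged(contaminated, flags)

print("First PC angle, contaminated data:", round(pc1_angle_deg(contaminated), 1))
print("First PC angle, after imputation: ", round(pc1_angle_deg(repaired), 1))
```

In this toy setting the uncontaminated data lie close to the line y = x, so a first principal axis near 45 degrees indicates that the structure has been recovered. A faithful comparison would use proper EM or MI estimates and an MCD-based covariance, which this sketch deliberately omits.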