A comparison of different procedures for principal component analysis in the presence of outliers
Principal component analysis (PCA) is a popular technique for dimensionality reduction, but it is sensitive to outliers. This sensitivity of classical PCA (CPCA) has motivated the development of new approaches. Effects of using estimates obtained by expectation-maxim...
Published in: | Journal of applied statistics 2015-08, Vol.42 (8), p.1716-1722 |
---|---|
Main Authors: | Alkan, B. Bariş; Atakan, Cemal; Alkan, Nesrin |
Format: | Article |
Language: | English |
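Subjects: | Applied statistics; Covariance; Data analysis; Data collection; Determinants; Estimates; Estimating techniques; expectation-maximization; missing value; multiple imputation; outliers; Outliers (statistics); Principal component analysis; Principal components analysis; Reduction; Statistical methods; Studies |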
cited_by | cdi_FETCH-LOGICAL-c441t-ff59a6577d817c789337693556352c352a5596d1b8ecf3a4c87ec90c429a866e3 |
---|---|
cites | cdi_FETCH-LOGICAL-c441t-ff59a6577d817c789337693556352c352a5596d1b8ecf3a4c87ec90c429a866e3 |
container_end_page | 1722 |
container_issue | 8 |
container_start_page | 1716 |
container_title | Journal of applied statistics |
container_volume | 42 |
creator | Alkan, B. Bariş; Atakan, Cemal; Alkan, Nesrin |
description | Principal component analysis (PCA) is a popular technique for dimensionality reduction, but it is sensitive to outliers. This sensitivity of classical PCA (CPCA) has motivated the development of new approaches. The effects of replacing outliers with estimates obtained by expectation-maximization (EM) and multiple imputation (MI) were examined on an artificial data set and a real data set. Furthermore, robust PCA based on the minimum covariance determinant (MCD), PCA based on EM estimates in place of outliers, and PCA based on MI estimates in place of outliers were compared with the results of CPCA. In this study, we tried to show how the effects of replacing outliers with MI and EM estimates depend on the proportion of outliers in the data set. Finally, when the proportion of outliers exceeds 20%, we suggest replacing outliers with MI and EM estimates as an alternative approach. |
doi_str_mv | 10.1080/02664763.2015.1005063 |
format | article |
fullrecord | <record><control><sourceid>proquest_infor</sourceid><recordid>TN_cdi_proquest_journals_1687115365</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3710323831</sourcerecordid><originalsourceid>FETCH-LOGICAL-c441t-ff59a6577d817c789337693556352c352a5596d1b8ecf3a4c87ec90c429a866e3</originalsourceid><addsrcrecordid>eNp9kE1LAzEURYMoWKs_QRhw42ZqMpl8zM5S_IKCG12HmHnBlDSpyQzSf2_G6saFi_DI49xLchC6JHhBsMQ3uOG8FZwuGkxYWWGGOT1CM0I5rjGjzTGaTUw9QafoLOcNxlgSRmdILysTtzudXI6hirbqnbWQIAzVLkUD_ZggVzamcnXBuJ3234EYJkQH7ffZ5cqFaniHwkCGYGAqiuPgHaR8jk6s9hkufuYcvd7fvawe6_Xzw9Nqua5N25KhtpZ1mjMhekmEEbKjVPCOMsYpa0w5mrGO9-RNgrFUt0YKMB02bdNpyTnQObo-9JZ3f4yQB7V12YD3OkAcsyKCN1hIQdqCXv1BN3FM5S-F4oUoZjgrFDtQJsWcE1hVFGx12iuC1SRe_YpXk3j1I77kbg85F4q3rf6Myfdq0Hsfk026SMyK_l_xBTCziYY</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1687115365</pqid></control><display><type>article</type><title>A comparison of different procedures for principal component analysis in the presence of outliers</title><source>Business Source Ultimate</source><source>Taylor and Francis Science and Technology Collection</source><creator>Alkan, B. Bariş ; Atakan, Cemal ; Alkan, Nesrin</creator><creatorcontrib>Alkan, B. Bariş ; Atakan, Cemal ; Alkan, Nesrin</creatorcontrib><description>Principal component analysis (PCA) is a popular technique that is useful for dimensionality reduction but it is affected by the presence of outliers. The outlier sensitivity of classical PCA (CPCA) has caused the development of new approaches. Effects of using estimates obtained by expectation-maximization - EM and multiple imputation - MI instead of outliers were examined on the artificial and a real data set. 
Furthermore, robust PCA based on minimum covariance determinant (MCD), PCA based on estimates obtained by EM instead of outliers and PCA based on estimates obtained by MI instead of outliers were compared with the results of CPCA. In this study, we tried to show the effects of using estimates obtained by MI and EM instead of outliers, depending on the ratio of outliers in data set. Finally, when the ratio of outliers exceeds 20%, we suggest the use of estimates obtained by MI and EM instead of outliers as an alternative approach.</description><identifier>ISSN: 0266-4763</identifier><identifier>EISSN: 1360-0532</identifier><identifier>DOI: 10.1080/02664763.2015.1005063</identifier><language>eng</language><publisher>Abingdon: Taylor & Francis</publisher><subject>Applied statistics ; Covariance ; Data analysis ; Data collection ; Determinants ; Estimates ; Estimating techniques ; expectation-maximization ; missing value ; multiple imputation ; outliers ; Outliers (statistics) ; Principal component analysis ; Principal components analysis ; Reduction ; Statistical methods ; Studies</subject><ispartof>Journal of applied statistics, 2015-08, Vol.42 (8), p.1716-1722</ispartof><rights>2015 Taylor & Francis 2015</rights><rights>Copyright Taylor & Francis Ltd. 2015</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c441t-ff59a6577d817c789337693556352c352a5596d1b8ecf3a4c87ec90c429a866e3</citedby><cites>FETCH-LOGICAL-c441t-ff59a6577d817c789337693556352c352a5596d1b8ecf3a4c87ec90c429a866e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,777,781,27905,27906</link.rule.ids></links><search><creatorcontrib>Alkan, B. 
Bariş</creatorcontrib><creatorcontrib>Atakan, Cemal</creatorcontrib><creatorcontrib>Alkan, Nesrin</creatorcontrib><title>A comparison of different procedures for principal component analysis in the presence of outliers</title><title>Journal of applied statistics</title><description>Principal component analysis (PCA) is a popular technique that is useful for dimensionality reduction but it is affected by the presence of outliers. The outlier sensitivity of classical PCA (CPCA) has caused the development of new approaches. Effects of using estimates obtained by expectation-maximization - EM and multiple imputation - MI instead of outliers were examined on the artificial and a real data set. Furthermore, robust PCA based on minimum covariance determinant (MCD), PCA based on estimates obtained by EM instead of outliers and PCA based on estimates obtained by MI instead of outliers were compared with the results of CPCA. In this study, we tried to show the effects of using estimates obtained by MI and EM instead of outliers, depending on the ratio of outliers in data set. 
Finally, when the ratio of outliers exceeds 20%, we suggest the use of estimates obtained by MI and EM instead of outliers as an alternative approach.</description><subject>Applied statistics</subject><subject>Covariance</subject><subject>Data analysis</subject><subject>Data collection</subject><subject>Determinants</subject><subject>Estimates</subject><subject>Estimating techniques</subject><subject>expectation-maximization</subject><subject>missing value</subject><subject>multiple imputation</subject><subject>outliers</subject><subject>Outliers (statistics)</subject><subject>Principal component analysis</subject><subject>Principal components analysis</subject><subject>Reduction</subject><subject>Statistical methods</subject><subject>Studies</subject><issn>0266-4763</issn><issn>1360-0532</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><recordid>eNp9kE1LAzEURYMoWKs_QRhw42ZqMpl8zM5S_IKCG12HmHnBlDSpyQzSf2_G6saFi_DI49xLchC6JHhBsMQ3uOG8FZwuGkxYWWGGOT1CM0I5rjGjzTGaTUw9QafoLOcNxlgSRmdILysTtzudXI6hirbqnbWQIAzVLkUD_ZggVzamcnXBuJ3234EYJkQH7ffZ5cqFaniHwkCGYGAqiuPgHaR8jk6s9hkufuYcvd7fvawe6_Xzw9Nqua5N25KhtpZ1mjMhekmEEbKjVPCOMsYpa0w5mrGO9-RNgrFUt0YKMB02bdNpyTnQObo-9JZ3f4yQB7V12YD3OkAcsyKCN1hIQdqCXv1BN3FM5S-F4oUoZjgrFDtQJsWcE1hVFGx12iuC1SRe_YpXk3j1I77kbg85F4q3rf6Myfdq0Hsfk026SMyK_l_xBTCziYY</recordid><startdate>20150803</startdate><enddate>20150803</enddate><creator>Alkan, B. Bariş</creator><creator>Atakan, Cemal</creator><creator>Alkan, Nesrin</creator><general>Taylor & Francis</general><general>Taylor & Francis Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>H8D</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20150803</creationdate><title>A comparison of different procedures for principal component analysis in the presence of outliers</title><author>Alkan, B. 
Bariş ; Atakan, Cemal ; Alkan, Nesrin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c441t-ff59a6577d817c789337693556352c352a5596d1b8ecf3a4c87ec90c429a866e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Applied statistics</topic><topic>Covariance</topic><topic>Data analysis</topic><topic>Data collection</topic><topic>Determinants</topic><topic>Estimates</topic><topic>Estimating techniques</topic><topic>expectation-maximization</topic><topic>missing value</topic><topic>multiple imputation</topic><topic>outliers</topic><topic>Outliers (statistics)</topic><topic>Principal component analysis</topic><topic>Principal components analysis</topic><topic>Reduction</topic><topic>Statistical methods</topic><topic>Studies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Alkan, B. Bariş</creatorcontrib><creatorcontrib>Atakan, Cemal</creatorcontrib><creatorcontrib>Alkan, Nesrin</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Aerospace Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Journal of applied statistics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Alkan, B. 
Bariş</au><au>Atakan, Cemal</au><au>Alkan, Nesrin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A comparison of different procedures for principal component analysis in the presence of outliers</atitle><jtitle>Journal of applied statistics</jtitle><date>2015-08-03</date><risdate>2015</risdate><volume>42</volume><issue>8</issue><spage>1716</spage><epage>1722</epage><pages>1716-1722</pages><issn>0266-4763</issn><eissn>1360-0532</eissn><abstract>Principal component analysis (PCA) is a popular technique that is useful for dimensionality reduction but it is affected by the presence of outliers. The outlier sensitivity of classical PCA (CPCA) has caused the development of new approaches. Effects of using estimates obtained by expectation-maximization - EM and multiple imputation - MI instead of outliers were examined on the artificial and a real data set. Furthermore, robust PCA based on minimum covariance determinant (MCD), PCA based on estimates obtained by EM instead of outliers and PCA based on estimates obtained by MI instead of outliers were compared with the results of CPCA. In this study, we tried to show the effects of using estimates obtained by MI and EM instead of outliers, depending on the ratio of outliers in data set. Finally, when the ratio of outliers exceeds 20%, we suggest the use of estimates obtained by MI and EM instead of outliers as an alternative approach.</abstract><cop>Abingdon</cop><pub>Taylor & Francis</pub><doi>10.1080/02664763.2015.1005063</doi><tpages>7</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0266-4763 |
ispartof | Journal of applied statistics, 2015-08, Vol.42 (8), p.1716-1722 |
issn | 0266-4763; 1360-0532 |
language | eng |
recordid | cdi_proquest_journals_1687115365 |
source | Business Source Ultimate; Taylor and Francis Science and Technology Collection |
subjects | Applied statistics; Covariance; Data analysis; Data collection; Determinants; Estimates; Estimating techniques; expectation-maximization; missing value; multiple imputation; outliers; Outliers (statistics); Principal component analysis; Principal components analysis; Reduction; Statistical methods; Studies |
title | A comparison of different procedures for principal component analysis in the presence of outliers |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T22%3A51%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_infor&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20comparison%20of%20different%20procedures%20for%20principal%20component%20analysis%20in%20the%20presence%20of%20outliers&rft.jtitle=Journal%20of%20applied%20statistics&rft.au=Alkan,%20B.%20Bari%C5%9F&rft.date=2015-08-03&rft.volume=42&rft.issue=8&rft.spage=1716&rft.epage=1722&rft.pages=1716-1722&rft.issn=0266-4763&rft.eissn=1360-0532&rft_id=info:doi/10.1080/02664763.2015.1005063&rft_dat=%3Cproquest_infor%3E3710323831%3C/proquest_infor%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c441t-ff59a6577d817c789337693556352c352a5596d1b8ecf3a4c87ec90c429a866e3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1687115365&rft_id=info:pmid/&rfr_iscdi=true |