H‐type indices with applications in chemometrics II: h‐outlyingness index
An outlier is generally regarded as a data point that deviates from the “bulk” of the data. Outlier diagnosis raises two questions: (1) How far is an object from the bulk? and (2) How many data points does the “bulk” include? To simultaneously deal with the above two question...
Published in: | Journal of chemometrics 2021-11, Vol.35 (11), p.n/a |
---|---|
Main Authors: | Yang, Qin ; Xu, Lu ; Tian, Guo‐Li ; Wu, Ben‐Qing |
Format: | Article |
Language: | English |
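Subjects: | Chemometrics ; Data points ; Datasets ; Diagnosis ; h‐index ; h‐outlyingness index (HOI) ; outlier diagnosis ; Outliers (statistics) ; Principal components analysis ; Questions ; robust statistics ; Robustness ; Statistical methods |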
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c2545-2ab14a8ba76c5947389fc7a76eec147d0a65005c8e0be334dd547cb65231712a3 |
container_end_page | n/a |
container_issue | 11 |
container_start_page | |
container_title | Journal of chemometrics |
container_volume | 35 |
creator | Yang, Qin ; Xu, Lu ; Tian, Guo‐Li ; Wu, Ben‐Qing |
description | An outlier is generally regarded as a data point that deviates from the “bulk” of the data. Outlier diagnosis raises two questions: (1) How far is an object from the bulk? and (2) How many data points does the “bulk” include? To address both questions simultaneously, the h‐outlyingness index (HOI) is defined as follows: for a given data point in a data set of N data points, if at most M% of its (N − 1) one‐to‐rest distances are no less than M% of all the N(N − 1)/2 pairwise distances, the HOI value of that data point is M%. HOI was applied to outlier diagnosis in simulated and real data sets, and the results were compared with those obtained by several robust statistical methods; HOI gave similar results to these traditional methods. For high‐dimensional data, it was advisable to compute HOI after dimension reduction, for example by principal component analysis (PCA). HOI was shown to be a simple, easy‐to‐compute, robust, and effective index for outlier diagnosis. Moreover, HOI is a nonparametric method that makes no assumptions about the data distribution, which makes it useful in chemometrics for multivariate outlier diagnosis.
The h‐outlyingness index (HOI) is described for outlier detection. The results demonstrate that HOI is a simple, nonparametric, robust, and effective index for outlier diagnosis in chemometrics. |
doi_str_mv | 10.1002/cem.3375 |
format | article |
publisher | Chichester: Wiley Subscription Services, Inc |
rights | 2021 John Wiley & Sons, Ltd. |
eissn | 1099-128X |
tpages | 9 |
orcid | https://orcid.org/0000-0003-4742-5623 |
toplevel | peer_reviewed ; online_resources |
collection | CrossRef ; Computer and Information Systems Abstracts ; Solid State and Superconductivity Abstracts ; Technology Research Database ; ProQuest Computer Science Collection ; Advanced Technologies Database with Aerospace ; Computer and Information Systems Abstracts Academic ; Computer and Information Systems Abstracts Professional |
fulltext | fulltext |
identifier | ISSN: 0886-9383 |
ispartof | Journal of chemometrics, 2021-11, Vol.35 (11), p.n/a |
issn | 0886-9383 1099-128X |
language | eng |
recordid | cdi_proquest_journals_2599262824 |
source | Wiley-Blackwell Read & Publish Collection |
subjects | Chemometrics ; Data points ; Datasets ; Diagnosis ; h‐index ; h‐outlyingness index (HOI) ; outlier diagnosis ; Outliers (statistics) ; Principal components analysis ; Questions ; robust statistics ; Robustness ; Statistical methods |
title | H‐type indices with applications in chemometrics II: h‐outlyingness index |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T15%3A01%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=H%E2%80%90type%20indices%20with%20applications%20in%20chemometrics%20II:%20h%E2%80%90outlyingness%20index&rft.jtitle=Journal%20of%20chemometrics&rft.au=Yang,%20Qin&rft.date=2021-11&rft.volume=35&rft.issue=11&rft.epage=n/a&rft.issn=0886-9383&rft.eissn=1099-128X&rft_id=info:doi/10.1002/cem.3375&rft_dat=%3Cproquest_cross%3E2599262824%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c2545-2ab14a8ba76c5947389fc7a76eec147d0a65005c8e0be334dd547cb65231712a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2599262824&rft_id=info:pmid/&rfr_iscdi=true |
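
The HOI definition quoted in the description field of this record can be read as an h‐index‐style rule, and a small sketch may make it easier to follow. The code below is one possible interpretation and not the authors' implementation. It assumes Euclidean distances, reads "a distance is no less than M% of the pairwise distances" as "its percentile rank among all N(N − 1)/2 pairwise distances is at least M%", and returns the largest M (as a fraction between 0 and 1) for which at least M of a point's N − 1 distances reach that rank. The function name `hoi` and the tie handling are assumptions made for illustration.

```python
from bisect import bisect_right
from math import dist


def hoi(points):
    """Illustrative h-outlyingness index per point, as a fraction in [0, 1].

    Assumption: Euclidean distance; ties count as "no less than".
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two data points")

    # Symmetric distance matrix; only the upper triangle is computed.
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = dist(points[i], points[j])

    # All N(N-1)/2 pairwise distances, sorted for rank lookups.
    pairwise = sorted(d[i][j] for i in range(n) for j in range(i + 1, n))
    total = len(pairwise)

    scores = []
    for i in range(n):
        # Fraction of pairwise distances that each one-to-rest distance
        # is no less than, sorted from largest to smallest.
        ranks = sorted(
            (bisect_right(pairwise, d[i][j]) / total
             for j in range(n) if j != i),
            reverse=True,
        )
        # h-index rule: largest M such that a fraction M of the point's
        # distances have rank >= M.
        scores.append(
            max(min((k + 1) / (n - 1), r) for k, r in enumerate(ranks))
        )
    return scores


# Four clustered points and one isolated point (hypothetical data).
data = [(0.0,), (1.0,), (2.0,), (3.0,), (100.0,)]
print(hoi(data))
```

In this toy data set the isolated point receives the largest score. For high‐dimensional data, the description suggests computing HOI after dimension reduction; under that reading, the same function could be applied to PCA scores instead of the raw variables.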