An extension of correspondence analysis based on the multiple Taguchi’s index to evaluate the relationships between three categorical variables graphically: an application to the Italian football championship

Bibliographic Details
Published in: Annals of Operations Research, 2023-06, Vol. 325 (1), p. 219-244
Main Authors: D’Ambra, Antonello; Amenta, Pietro
Format: Article
Language: English
Description
Summary: The aim of this paper is to evaluate the relationships between three categorical variables, of which at least one is ordinal, from a graphical point of view and also using inferential tools. Three-way Correspondence Analysis is a useful data science visualisation technique to find and display these relationships. This analysis, like the classical two-way analysis, cannot be applied efficiently in the presence of ordinal categorical variables because this characteristic is not taken directly into account by Pearson’s chi-square contingency coefficient. Taguchi (Statistical analysis, Maruzen, Tokyo, 1966; Igaku 29:806–813, 1974) introduced a statistic that considers the ordinal nature of a categorical variable using the cumulative frequency of the cells of the contingency table across this variable. He introduced it as a simple alternative to Pearson’s statistic for ordered contingency tables. This index is also the basis of several Correspondence Analysis extensions that have been proposed in the literature. We have developed a multiple extension of Taguchi’s index. An enhancement of Correspondence Analysis has also been developed based on the decomposition of this index. An orthogonal decomposition of this new index has been introduced to test the statistical significance of each aggregated column category. Moreover, a confidence region for each row and aggregated column category of the table has been developed. An application has been developed to highlight the easy applicability and graphical reading of the results of our approach. In this study, we evaluate the relationships between the ranking of the Italian football “Serie A” championship of the last 10 seasons and a set of two factors defined by average percentage of ball possession and number of tags for each team. This new approach may represent a useful guide for researchers who graphically analyse ranking data.
ISSN: 0254-5330, 1572-9338
DOI: 10.1007/s10479-022-04803-3