
ON CLASSIFICATION OF A BIVARIATE BINARY OBSERVATION

Bibliographic Details
Published in: Communications in statistics. Theory and methods, 2001-01, Vol. 30 (11), p. 2259-2279
Main Authors: Sutradhar, Santosh C.; Sutradhar, Brajendra
Format: Article
Language: English
Description
Summary: Classification of a bivariate binary observation into one of two possible groups requires estimating the joint cell probabilities under each group. Two widely used approaches for estimating these joint cell probabilities are: [1] a kernel-based non-parametric approach; and [2] a multinomial-distribution-based cell counts approach. In these traditional approaches, the joint cell probabilities are estimated without any assumptions about their structural form. Consequently, it is not clear how these approaches take into account the correlation that may exist between the two components of the binary observation. In this paper, we model the cell probabilities by a suitable bivariate binary distribution that accommodates the correlation in a natural way, and we examine the effect of this type of modelling on classifying a new correlated binary observation into one of the two groups. This is done by comparing the probability of misclassification yielded by the proposed model-based approach with those of the kernel-based and multinomial-distribution-based approaches. A simulation study shows that the probabilities of misclassification for the model-based approach are substantially smaller than those of the other two approaches. We illustrate the use of the proposed model-based approach in classification by analyzing combined data from two epidemiological surveys of 6- to 11-year-old children conducted in Connecticut: the New Haven Child Survey (NHCS) and the Eastern Connecticut Child Survey (ECCS).
ISSN: 0361-0926, 1532-415X
DOI:10.1081/STA-100107684
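
The classification problem described in the summary can be illustrated with a minimal Python sketch. It shows the multinomial cell counts approach [2] and a plug-in rule that assigns a new observation to the group with the larger prior-weighted cell probability. The function names, the `prior1` parameter, and the tie-breaking rule are assumptions made for illustration. The helper `joint_probs_from_correlation` uses the standard identity P(Y1=1, Y2=1) = p1*p2 + rho*sqrt(p1*(1-p1)*p2*(1-p2)) to show how a correlation parameter can determine the four cells; it is a generic parameterization and not necessarily the bivariate binary distribution used in the paper.

```python
from collections import Counter
from math import sqrt

# The four cells of a bivariate binary observation (y1, y2).
CELLS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def cell_count_probs(sample):
    """Multinomial (cell counts) estimate: relative frequency of each cell."""
    counts = Counter(sample)
    n = len(sample)
    return {cell: counts[cell] / n for cell in CELLS}


def joint_probs_from_correlation(p1, p2, rho):
    """Cell probabilities implied by marginals p1, p2 and correlation rho.

    Illustrative parameterization only (an assumption, not the paper's model).
    """
    p11 = p1 * p2 + rho * sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    probs = {
        (1, 1): p11,
        (1, 0): p1 - p11,
        (0, 1): p2 - p11,
        (0, 0): 1 - p1 - p2 + p11,
    }
    if min(probs.values()) < 0:
        raise ValueError("rho is outside the range allowed by p1 and p2")
    return probs


def classify(obs, probs1, probs2, prior1=0.5):
    """Assign obs to group 1 or 2 by the larger prior-weighted cell probability.

    Ties go to group 1 (an arbitrary choice for this sketch).
    """
    score1 = prior1 * probs1[obs]
    score2 = (1 - prior1) * probs2[obs]
    return 1 if score1 >= score2 else 2
```

For example, with training samples `g1` and `g2` given as lists of `(y1, y2)` tuples, `classify((1, 1), cell_count_probs(g1), cell_count_probs(g2))` returns the label of the group under which the cell `(1, 1)` is estimated to be more probable.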