Application of time series discretization using evolutionary programming for classification of precancerous cervical lesions
Published in: | Journal of biomedical informatics 2014-06, Vol.49, p.73-83 |
---|---|
Format: | Article |
Language: | English |
Summary: |
• We present a novel application of time series discretization.
• Evolutionary programming is used to search for a good discretization scheme.
• Temporal patterns observed in colposcopy are used as predictors of cervical cancer.
• The discrete representation is as efficient as the raw data for this application.
In this work, we present a novel application of time series discretization using evolutionary programming for the classification of precancerous cervical lesions. The approach optimizes the number of intervals into which the length and amplitude of the time series are compressed, preserving the information that matters for classification. Using evolutionary programming, the search for a good discretization scheme is guided by a cost function that considers three criteria: the entropy with respect to the classification, the complexity, measured as the number of different strings needed to represent the complete data set, and the compression rate, assessed as the length of the discrete representation. The approach is evaluated on time series data based on temporal patterns observed during a classical test used in cervical cancer detection; the classification accuracy reached by our method is compared with that of the well-known time series discretization algorithm SAX and the dimensionality reduction method PCA. Statistical analysis of the classification accuracy shows that the discrete representation is as efficient as the complete raw representation for the present application, while reducing the length of the time series by 97%. This representation is also very competitive in terms of classification accuracy when compared with similar approaches. |
---|---|
ISSN: | 1532-0464 1532-0480 |
DOI: | 10.1016/j.jbi.2014.03.004 |
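
The summary names three criteria for the cost function (entropy with respect to the class, number of distinct strings, and length of the discrete representation) but does not give its formula. The sketch below is only one plausible reading, not the authors' implementation: the function names, the segment-mean discretization, the normalization of each term, the equal default weights, and the assumption that a lower cost is better are all illustrative choices. The evolutionary search over candidate schemes is not shown.

```python
from bisect import bisect_right
from collections import Counter, defaultdict
from math import log2


def discretize(series, time_cuts, amp_cuts):
    """Turn one numeric series into a tuple of integer symbols.

    time_cuts: strictly increasing indices inside (0, len(series)) that
               split the series into non-empty segments (assumed form).
    amp_cuts:  sorted amplitude thresholds that define the alphabet.
    """
    bounds = [0, *time_cuts, len(series)]
    symbols = []
    for start, end in zip(bounds, bounds[1:]):
        segment = series[start:end]
        mean = sum(segment) / len(segment)
        # The symbol is the index of the amplitude interval holding the mean.
        symbols.append(bisect_right(amp_cuts, mean))
    return tuple(symbols)


def class_entropy(words, labels):
    """Average entropy of the class labels within each distinct word."""
    groups = defaultdict(list)
    for word, label in zip(words, labels):
        groups[word].append(label)
    total = len(words)
    entropy = 0.0
    for members in groups.values():
        counts = Counter(members)
        h = -sum((c / len(members)) * log2(c / len(members))
                 for c in counts.values())
        entropy += (len(members) / total) * h
    return entropy


def cost(dataset, labels, time_cuts, amp_cuts, weights=(1.0, 1.0, 1.0)):
    """Hypothetical weighted sum of the three criteria (lower is better)."""
    words = [discretize(s, time_cuts, amp_cuts) for s in dataset]
    entropy = class_entropy(words, labels)
    # Distinct strings needed for the data set, normalized by its size.
    complexity = len(set(words)) / len(words)
    # Discrete length relative to raw length (equal-length series assumed).
    compression = len(words[0]) / len(dataset[0])
    w_e, w_c, w_l = weights
    return w_e * entropy + w_c * complexity + w_l * compression
```

Under these assumptions, a scheme that separates the classes with few distinct short strings scores lower than one that merges both classes into a single string, which is the trade-off an evolutionary search would explore by mutating `time_cuts` and `amp_cuts`.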