
Application of the Cluster Classification Data Mining Method to Child Illiteracy in Indonesia

Bibliographic Details
Published in: Library Philosophy and Practice 2021-03, p.1-6
Main Authors: Arifin, Muhammad; Bhawika, Gita Widi; Habibi, Muazar M A; Firdaus, Winci; Agustinova, Danu Eko; Rahim, Robbi
Format: Article
Language: English
Description
Summary: The objective of this study is to cluster and classify data using a combination of the k-means and C4.5 methods: the data are first clustered, and the clustering results are then classified. The classification step uses 10-fold cross-validation with stratified sampling. The study evaluates analphabets (illiterate people) in Indonesia aged 15 years and over (15+). The data are the percentages of analphabets between 2017 and 2019. The dataset was obtained from https://www.bps.go.id and is accessible at https://osf.io/crwug. The Davies-Bouldin index (DBI) was used to determine the number of clusters; the optimal DBI value, 0.121, was obtained at k = 2. With k = 2, the cluster map of Indonesian territories shows a low cluster (C0 = 22 provinces) and a high cluster (C1 = 11 provinces) of analphabets. The clustering results were then classified, yielding an accuracy of 97.50%, a recall of 90.91%, a precision of 100.00%, and an AUC (optimistic) of 0.95 (excellent classification).
ISSN: 1522-0222
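
The summary above selects the number of clusters by minimising the Davies-Bouldin index over k-means runs. The following is a minimal sketch of that selection step only, not the authors' implementation. It makes several assumptions: the percentages are invented for illustration and are not the BPS dataset, the data are treated as one value per province, the centroid initialisation is a simple deterministic spread chosen for reproducibility, and the function names `kmeans_1d` and `davies_bouldin` are hypothetical. The C4.5 classification with 10-fold stratified cross-validation is not shown.

```python
def kmeans_1d(values, k, iters=100):
    """Lloyd's k-means on a list of floats. Returns (centroids, labels)."""
    # Assumption: start from k evenly spread distinct data values.
    distinct = sorted(set(values))
    n = len(distinct)
    centroids = [distinct[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j]))
                  for v in values]
        # Update step: each centroid becomes the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members
                           else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def davies_bouldin(values, centroids, labels):
    """Davies-Bouldin index: lower means tighter, better separated clusters."""
    k = len(centroids)
    # Scatter S_i: mean distance of a cluster's members to its centroid.
    scatter = []
    for j in range(k):
        members = [v for v, lab in zip(values, labels) if lab == j]
        scatter.append(sum(abs(v - centroids[j]) for v in members)
                       / len(members))
    # For each cluster, take the worst similarity ratio to any other cluster.
    worst = []
    for i in range(k):
        ratios = [(scatter[i] + scatter[j]) / abs(centroids[i] - centroids[j])
                  for j in range(k) if j != i]
        worst.append(max(ratios))
    return sum(worst) / k


# Invented percentages (NOT the study's data): eight low and four high values.
percentages = [0.5, 0.8, 1.0, 1.2, 1.5, 1.9, 2.1, 2.4,
               9.5, 10.0, 10.4, 11.0]

# Score each candidate k and keep the one with the lowest index.
scores = {}
for k in (2, 3, 4):
    centroids, labels = kmeans_1d(percentages, k)
    scores[k] = davies_bouldin(percentages, centroids, labels)

best_k = min(scores, key=scores.get)
print(best_k, scores)
```

On real data the cluster labels for the chosen k would then serve as the class attribute for the decision tree, which is the second stage the summary describes.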