Balancing sequential data to predict students at-risk using adversarial networks

Bibliographic Details
Published in:Computers & electrical engineering 2021-07, Vol.93, p.107274, Article 107274
Main Authors: Waheed, Hajra, Anas, Muhammad, Hassan, Saeed-Ul, Aljohani, Naif Radi, Alelyani, Salem, Edifor, Ernest Edem, Nawaz, Raheel
Format: Article
Language:English
Description
Summary:
• This study presents a novel adversarial-based approach to up-sample sequential data in an educational setting.
• The proposed method generates new student sequences such that the past behavior of each student is encapsulated in that student's next sequence.
• Data from the Open University (UK) is transformed into a sequential format and used as a case study to eliminate the imbalance in students' academic performance.
• The proposed approach outperforms the conventional state-of-the-art Random Over-sampling and Synthetic Minority Over-sampling techniques, improving AUC by 7.07% and 6.53%, respectively.

Class imbalance is a challenging problem, especially in a supervised learning setup, as most classification algorithms are designed for balanced class distributions. Although various up-sampling approaches exist for eliminating class imbalance, they do not handle the complexities of sequential data. In this study, using data on over 30,000 students from the Open University (UK), we implement a deep-learning-based approach using adversarial networks, the Sequential Conditional Generative Adversarial Network (SC-GAN), which encapsulates the past behavior of each student from its previous sequences and generates synthetic student records for the next timestamp. The generated instances are used to augment the actual data and so eliminate class imbalance. A performance comparison of SC-GAN with standard up-sampling methods is also presented; the results validate the proposed method, with AUC improved by 7.07% and 6.53% compared with the conventional Random Over-sampling and Synthetic Minority Over-sampling techniques, respectively.
ISSN:0045-7906, 1879-0755
DOI:10.1016/j.compeleceng.2021.107274
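
For orientation, the Random Over-sampling baseline that the summary compares against can be sketched for sequential data as follows. This is a minimal illustration and not the authors' code or the SC-GAN itself: the array layout (students × timesteps × features), the function name `random_oversample_sequences`, and the toy data are all assumptions made for the example. It duplicates whole minority-class sequences, chosen at random with replacement, until every class matches the size of the largest one.

```python
import numpy as np

def random_oversample_sequences(X, y, seed=0):
    """Balance classes by duplicating whole minority-class sequences.

    X: array of shape (n_students, n_timesteps, n_features)  (assumed layout)
    y: array of shape (n_students,) with integer class labels
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()  # size of the largest class

    # Start with every original student, then append random duplicates
    # for each class that is smaller than the largest class.
    selected = [np.arange(len(y))]
    for cls, count in zip(classes, counts):
        if count < target:
            idx = np.flatnonzero(y == cls)
            selected.append(rng.choice(idx, size=target - count, replace=True))

    order = np.concatenate(selected)
    return X[order], y[order]

# Hypothetical toy data: 10 students, 4 timesteps, 3 features, 8:2 imbalance.
toy_rng = np.random.default_rng(42)
X = toy_rng.normal(size=(10, 4, 3))
y = np.array([0] * 8 + [1] * 2)

X_bal, y_bal = random_oversample_sequences(X, y)
print(X_bal.shape, np.bincount(y_bal))
```

Because each sequence is copied as a unit, the temporal order within a student's record is preserved, but no new behavior is created; the summary's point is that a generative model conditioned on past sequences can produce new records instead of exact duplicates.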