Loading…

Utilizing programming traces to explore and model the dimensions of novices' code‐writing skill

Studies have found that most novice programmers have low proficiency in writing code. However, it is unclear what subskills compose code writing and which subskills novice programmers struggle with. This study utilizes programming traces to identify latent subskills that constitute code writing so t...

Full description

Saved in:
Bibliographic Details
Published in:Computer applications in engineering education 2023-07, Vol.31 (4), p.1041-1058
Main Authors: Zhang, Yingbin, Paquette, Luc, Pinto, Juan D., Fan, Aysa Xuemo
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Studies have found that most novice programmers have low proficiency in writing code. However, it is unclear what subskills compose code writing and which subskills novice programmers struggle with. This study utilizes programming traces to identify latent subskills that constitute code writing so that teachers can offer specific instruction on the weak subskills. Data were collected from an undergraduate course teaching introductory computer science in Java. Six hundred and fourteen students made submissions to homework programming questions in a web‐based learning system. Based on the submission traces, we computed 11 features related to correctness and time students spent on their submissions. We conducted an exploratory factor analysis on two‐thirds of students selected randomly and identified four factors. The first factor, code style proficiency, was mainly related to code style errors. The second, syntactic proficiency, concerned compiler errors. The third is semantic proficiency, which concerns runtime and logic errors. The fourth, syntactic debugging proficiency, concerned the success rate and time required for fixing compiler and code style errors. A confirmatory factor analysis conducted on the remaining one‐third of the data supported the four‐factor structure. The factor model showed measurement invariance between the data set where the model was developed and two new datasets, one from the same sample but collected at a different time point and another from a different sample and context (onsite course vs. online course). The factors were related to prior programming abilities, programming language familiarity, and future exam performance. These associations provided validity evidence for the factor model.
ISSN:1061-3773
1099-0542
DOI:10.1002/cae.22622