ESL essay raters’ cognitive processes in applying the Jacobs et al. rubric: An eye-movement study

Bibliographic Details
Published in:	Assessing Writing, 2015-07, Vol. 25, pp. 38-54
Main Authors:	Winke, Paula; Lim, Hyojung
Format:	Article
Language:	English
Subjects:	Analytic scoring; Cognitive processing; Essay rating; Inter-rater reliability; Rater effects; Rubric use

Description
Summary:
• We used eye tracking to measure raters’ attention to analytic rubric subcomponents.
• Attention was associated with the essay raters’ inter-rater reliability estimates.
• Raters who agreed the most had common attentional foci across the subcomponents.
• Disagreeing raters read different parts of the rubric to justify their scores.
• We discuss rubric layout as an important factor in test-construct articulation.

We investigated how nine trained raters used a popular five-component analytic rubric by Jacobs et al. (1981; reproduced in Weigle, 2002). Because cognition drives eye movement (Reichle, Warren, & McConnell, 2009), we recorded the raters’ eye movements while they rated 40 English essays: by inspecting what raters attend to on a rubric, we gain insight into their thoughts. We estimated inter-rater reliability for each subcomponent. Attention (measured as total eye-fixation duration and eye-visit count, with the number of words per subcomponent controlled) was associated with inter-rater reliability. Organization (the second category) received the most attention (slightly more than the first category, content) and also had the highest inter-rater reliability (ICC = .92). Raters attended least to, and agreed least on, mechanics (the last category; ICC = .85). Raters who agreed the most had common attentional foci across the subcomponents. Disagreements were directly viewable in heatmaps of the eye-movement data. We discuss the rubric in terms of primacy: raters paid the most attention to organization and content because these categories were on the left (and were read first). We hypothesize what would happen if test developers were to remove the least reliable (and right-most) subcomponent, mechanics. We discuss rubric design as an important factor in test-construct articulation.
ISSN:	1075-2935; 1873-5916
DOI:	10.1016/j.asw.2015.05.002