Improving the validity of script concordance testing by optimising and balancing items
Published in: | Medical education 2018-03, Vol.52 (3), p.336-346 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | Background
A script concordance test (SCT) is a modality for assessing clinical reasoning. Concerns had been raised about the plausible validity threat to SCT scores if students deliberately avoided the extreme answer options to obtain higher scores. The aims of the study were firstly to investigate whether students’ avoidance of the extreme answer options could result in higher scores, and secondly to determine whether a ‘balanced approach’ by careful construction of SCT items (to include extreme as well as median options as model responses) would improve the validity of an SCT.
Methods
Using the paired sample t‐test, the actual average student scores for 10 SCT papers from 2012–2016 were compared with simulated scores. The latter were generated by recoding all ‘−2’ responses to ‘−1’ and ‘+2’ responses to ‘+1’ for the whole and bottom 10% of the cohort (simulation 1), and scoring as if all students had chosen ‘0’ for their responses (simulation 2). The actual average and simulated average scores in 2012 (before the ‘balanced approach’) were compared with those from 2013–2016, when papers had a good balance of modal responses from the expert reference panel.
Results
In 2012, a score increase was seen in simulation 1 in the third‐year cohort, from 50.2 to 55.6% (t [10] = 4.818; p = 0.001). Since 2013, with the ‘balanced approach’, the actual SCT scores (57.4%) were significantly higher than scores in both simulation 1 and simulation 2 (46.7% and 23.9% respectively).
Conclusions
When constructing SCT examinations, apart from the rigorous pre‐examination optimisation, it is desirable to achieve a balance between items that attract extreme responses and those that attract median response options. This could mitigate the validity threat to SCT scores, especially for the low‐performing students who have previously been shown to only select median responses and avoid the extreme responses.
The authors demonstrate that SCT score validity could be enhanced by constructing papers that have a roughly equal distribution of answer options in the expert reference panel's modal responses. |
---|---|
ISSN: | 0308-0110 1365-2923 |
DOI: | 10.1111/medu.13495 |
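
The two simulations described under Methods can be illustrated with a minimal sketch. The record does not state the scoring formula, so this assumes the commonly used aggregate scoring method, in which the credit for an option is the number of panellists who chose it divided by the number who chose the modal option. The panel responses, the student answers and all function names are hypothetical and are not data from the study; the paired-sample t-test across papers is left out.

```python
from collections import Counter

OPTIONS = (-2, -1, 0, 1, 2)  # five-point SCT response scale


def item_credits(panel_responses):
    """Assumed aggregate scoring: credit = panel count / modal count."""
    counts = Counter(panel_responses)
    modal_count = max(counts.values())
    return {opt: counts.get(opt, 0) / modal_count for opt in OPTIONS}


def score_paper(answers, panels):
    """Percentage score for one student across all items."""
    total = sum(item_credits(panel)[ans] for ans, panel in zip(answers, panels))
    return 100.0 * total / len(panels)


def avoid_extremes(answers):
    """Simulation 1: recode -2 to -1 and +2 to +1."""
    return [max(-1, min(1, ans)) for ans in answers]


def all_zero(answers):
    """Simulation 2: score as if every response were 0."""
    return [0] * len(answers)


# Hypothetical 'balanced' paper: modal responses are -2, 0, +2 and 0.
panels = [
    [-2, -2, -2, -1, 0],
    [0, 0, 0, 1, -1],
    [2, 2, 1, 2, 2],
    [0, 0, 1, 1, 0],
]
student = [-2, 0, 2, 1]  # hypothetical student answers

actual = score_paper(student, panels)
sim1 = score_paper(avoid_extremes(student), panels)
sim2 = score_paper(all_zero(student), panels)
print(f"actual={actual:.2f}  simulation 1={sim1:.2f}  simulation 2={sim2:.2f}")
```

In this toy paper, half of the modal responses are extreme options, so recoding the extremes or answering '0' throughout lowers the score. This is the direction of the effect the abstract reports for 2013–2016, not a reproduction of its figures.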