Automated essay scoring and the future of educational assessment in medical education

Bibliographic Details
Published in: Medical Education, 2014-10, Vol. 48 (10), p. 950-962
Main Authors: Gierl, Mark J, Latifi, Syed, Lai, Hollis, Boulais, André-Philippe, De Champlain, André
Format: Article
Language: English
Summary:
Context: Constructed‐response tasks, which range from short‐answer tests to essay questions, are included in assessments of medical knowledge because they allow educators to measure students’ ability to think, reason, solve complex problems, communicate and collaborate through their use of writing. However, constructed‐response tasks are also costly to administer and challenging to score because they rely on human raters. One alternative to the manual scoring process is to integrate computer technology with writing assessment. The process of scoring written responses using computer programs is known as ‘automated essay scoring’ (AES).
Methods: An AES system uses a computer program that builds a scoring model by extracting linguistic features from responses to a constructed‐response prompt that have been pre‐scored by human raters and then, using machine learning algorithms, maps the linguistic features to the human scores so that the computer can be used to classify (i.e. score or grade) the responses of a new group of students. The accuracy of the score classification can be evaluated using different measures of agreement.
Results: Automated essay scoring provides a method for scoring constructed‐response tests that complements the current use of selected‐response testing in medical education. The method can serve medical educators by providing the summative scores required for high‐stakes testing. It can also serve medical students by providing them with detailed feedback as part of a formative assessment process.
Conclusions: Automated essay scoring systems yield scores that consistently agree with those of human raters at a level as high as, if not higher than, the level of agreement among human raters themselves. The system offers medical educators many benefits for scoring constructed‐response tasks, such as improving the consistency of scoring, reducing the time required for scoring and reporting, minimising the costs of scoring, and providing students with immediate feedback.
Discuss ideas arising from the article at www.mededuc.com ‘discuss’
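The Methods passage maps onto a small machine learning pipeline, sketched below in Python. This is a minimal, hypothetical illustration, not the system evaluated in the article: the toy essays, the two crude features (response length and TF‐IDF word usage) and the ridge regression model are assumptions made for the example, whereas operational AES engines extract hundreds of linguistic features.

```python
# Hypothetical sketch of the AES workflow described in the abstract.
# The essays, features and model are illustrative assumptions only.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.metrics import cohen_kappa_score

# Step 1: responses to one prompt, pre-scored by human raters (toy data).
train_essays = [
    "The patient presents with chest pain radiating to the left arm.",
    "Give aspirin.",
    "The differential includes myocardial infarction and pericarditis.",
    "Order an electrocardiogram and cardiac enzymes before treatment.",
]
human_scores = np.array([4, 1, 5, 3])  # holistic scores on a 1-5 rubric

# Step 2: extract linguistic features; response length and word usage
# (TF-IDF) stand in here for a much richer linguistic feature set.
vectorizer = TfidfVectorizer()
tfidf_train = vectorizer.fit_transform(train_essays).toarray()
length_train = np.array([[len(e.split())] for e in train_essays])
X_train = np.hstack([length_train, tfidf_train])

# Step 3: use machine learning to map the features to the human scores.
model = Ridge(alpha=1.0).fit(X_train, human_scores)

# Step 4: classify (score) the responses of a new group of students,
# rounding and clipping predictions onto the 1-5 rubric scale.
new_essays = ["Chest pain with diaphoresis suggests acute coronary syndrome."]
tfidf_new = vectorizer.transform(new_essays).toarray()
length_new = np.array([[len(e.split())] for e in new_essays])
X_new = np.hstack([length_new, tfidf_new])
machine_scores = np.clip(np.rint(model.predict(X_new)), 1, 5).astype(int)
print("machine scores:", machine_scores)

# Step 5: evaluate score classification with a measure of agreement,
# here quadratically weighted kappa between machine and human scores.
train_preds = np.clip(np.rint(model.predict(X_train)), 1, 5).astype(int)
print("weighted kappa:",
      cohen_kappa_score(human_scores, train_preds, weights="quadratic"))
```

Computing kappa on the training data, as here, only demonstrates the agreement measure; in practice machine scores are compared with independent human scores on held-out responses, and quadratically weighted kappa is one of several agreement statistics in use.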
ISSN: 0308-0110
1365-2923
DOI: 10.1111/medu.12517