How Robust Are Cross-Country Comparisons of PISA Scores to the Scaling Model Used?

The Programme for International Student Assessment (PISA) is an important international study of 15-year-olds' knowledge and skills. New results are released every 3 years, and have a substantial impact upon education policy. Yet, despite its influence, the methodology underpinning PISA has received...

Bibliographic Details
Published in: Educational measurement, issues and practice, 2018-12, Vol.37 (4), p.28-39
Main Authors: Jerrim, John; Parker, Philip; Choi, Alvaro; Chmielewski, Anna Katyn; Sälzer, Christine; Shure, Nikki
Format: Article
Language: English
container_end_page 39
container_issue 4
container_start_page 28
container_title Educational measurement, issues and practice
container_volume 37
creator Jerrim, John
Parker, Philip
Choi, Alvaro
Chmielewski, Anna Katyn
Sälzer, Christine
Shure, Nikki
description The Programme for International Student Assessment (PISA) is an important international study of 15-year-olds' knowledge and skills. New results are released every 3 years, and have a substantial impact upon education policy. Yet, despite its influence, the methodology underpinning PISA has received significant criticism. Much of this criticism has focused upon the psychometric scaling model used to create the proficiency scores. The aim of this article is therefore to investigate the robustness of cross-country comparisons of PISA scores to subtle changes to the underlying scaling model used. This includes the specification of the item-response model, whether the difficulty and discrimination of items are allowed to vary across countries (item-by-country interactions) and how test questions not reached by pupils are treated. Our key finding is that these technical choices make little substantive difference to the overall country-level results. [Author abstract]
doi_str_mv 10.1111/emip.12211
format article
publisher Washington: Wiley-Blackwell
rights 2018 by the National Council on Measurement in Education; info:eu-repo/semantics/openAccess
identifier ISSN: 0731-1745
ispartof Educational measurement, issues and practice, 2018-12, Vol.37 (4), p.28-39
issn 0731-1745
1745-3992
language eng
recordid cdi_crossref_primary_10_1111_emip_12211
source Wiley-Blackwell Read & Publish Collection; ERIC
subjects Academic achievement
Achievement Tests
Avaluació educativa
Comparative Education
Comparative method
Education policy
Educational Assessment
Educational evaluation
Educational tests & measurements
Foreign Countries
International Assessment
International comparisons
International education
Item Response Theory
Large scale assessment
large‐scale international assessments
Mètode comparatiu
PISA
Programme for International Student Assessment (PISA)
Psychometrics
Rendiment acadèmic
Robustness (Statistics)
Scaling
Scores
Secondary School Students
Student assessment
Test Items
title How Robust Are Cross-Country Comparisons of PISA Scores to the Scaling Model Used?
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T18%3A34%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=How%20Robust%20Are%20Cross-Country%20Comparisons%20of%20PISA%20Scores%20to%20the%20Scaling%20Model%20Used?&rft.jtitle=Educational%20measurement,%20issues%20and%20practice&rft.au=Jerrim,%20John&rft.date=2018-12-01&rft.volume=37&rft.issue=4&rft.spage=28&rft.epage=39&rft.pages=28-39&rft.issn=0731-1745&rft.eissn=1745-3992&rft_id=info:doi/10.1111/emip.12211&rft_dat=%3Cproquest_cross%3E2160690060%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c4601-efeb345a1546d60b310410863a2bb084926e23886cdbb01c6a14e4d9f82c9513%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2160690060&rft_id=info:pmid/&rft_ericid=EJ1200968&rft_informt_id=10.3316/aeipt.222774&rfr_iscdi=true