Comparing Examination Standards without Graded Candidate Scripts
- Pub. date: December 15, 2022
- Pages: 79-90
Comparative judgement methods are commonly used to explore standards in examination papers over time. However, studies are limited by a paucity of graded candidate scripts from previous years, as well as the expense and time required to standardise scripts. We present three studies that attempted, without the use of graded candidate scripts, to replicate and extend previous results about standards in mathematics examination papers. We found that re-typesetting examination papers into a consistent format was necessary, but that comparative judgement of examination papers without an archive of graded candidate scripts offered a reliable and efficient method for revealing relative demand over time. Our approach enables standards comparison where previously this was not possible. We found a reasonable correlation between judgements of actual student scripts and judgements of the examination items alone, meaning that conclusions may be drawn about the demand of examination papers even when graded candidate scripts are unavailable.
Keywords: Comparative judgement, comparing demand, mathematics, student scripts, re-typesetting.
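Comparative-judgement studies such as this one typically fit a Bradley–Terry model to the judges' pairwise decisions, placing the compared items on a single scale of relative demand. As a rough illustration only (this is not the authors' analysis code, and the year labels and judgement data below are invented), a minimal fit might look like:

```python
# Minimal Bradley-Terry fit for comparative-judgement data (illustrative sketch).
# Each tuple (winner, loser) records one judge's decision that the first
# paper was more demanding than the second.
from collections import defaultdict
import math

def fit_bradley_terry(comparisons, iterations=100):
    """Estimate a relative-demand score per item from pairwise judgements.

    Uses the standard iterative (MM) update; assumes every item wins
    at least one comparison, otherwise its score degenerates to zero.
    """
    items = {i for pair in comparisons for i in pair}
    strength = {i: 1.0 for i in items}  # initial abilities
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
    for _ in range(iterations):
        new = {}
        for i in items:
            denom = sum(
                pair_counts[frozenset((i, j))] / (strength[i] + strength[j])
                for j in items if j != i and pair_counts[frozenset((i, j))]
            )
            new[i] = wins[i] / denom if denom else strength[i]
        total = sum(new.values())
        strength = {i: s * len(items) / total for i, s in new.items()}
    # Report on a log scale, as comparative-judgement studies typically do.
    return {i: math.log(s) for i, s in strength.items()}

# Invented judgements: e.g. the "1995" paper judged more demanding than "2015".
data = [("1995", "2015"), ("1995", "2005"), ("2005", "2015"),
        ("1995", "2015"), ("2005", "2015"), ("2015", "1995")]
scores = fit_bradley_terry(data)
ranking = sorted(scores, key=scores.get, reverse=True)  # most to least demanding
```

The resulting scale locates each paper's relative demand; published studies additionally report a scale-separation reliability statistic before interpreting the ranking.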