Measuring rater reliability in the assessment of the Norwegian language test in written production
DOI: https://doi.org/10.5617/adno.6358

Abstract
Raters assess written texts differently, and human rating is a challenge for test reliability. This is a challenge that Kompetanse Norge (Skills Norway) must account for when developing and quality-assuring the Norwegian language test for adult immigrants. This article describes how the statistical model Many-Facet Rasch Measurement (MFRM) was used to examine the reliability of the rater panel in the rating of the test's written production component for the December 2017 administration. The MFRM model provides information about how severe and how consistent each rater is when assessing candidate responses. The analysis shows clear differences in severity within the rater panel, and a candidate's final result can be affected by which raters assess the response. At the same time, we find that most of the 77 raters rate stably and reliably, that is, they show high intra-rater reliability. This indicates that the rater panel largely fulfils the goal of raters acting as independent experts with consistent rating behaviour. Finally, we discuss the challenges that limitations in the test's design pose for analysing rater reliability. In light of this discussion, we assess the role and suitability of MFRM and point to some areas for further development.
Keywords: Norwegian language test, written assessment, reliability, inter-rater reliability, intra-rater reliability, Many-Facet Rasch Measurement
Norwegian language test - Measuring rater reliability in the assessment of written presentation
Abstract
Raters assess written texts differently, and rater-mediated assessment is a challenge for test reliability. This is something Skills Norway has to take into consideration as developer of the Norwegian language test for adult immigrants. In this article, we demonstrate how the statistical model Many-Facet Rasch Measurement (MFRM) has been used to examine rater reliability in the written part of the test, using data from the December 2017 administration. The MFRM model produces severity and consistency estimates for each rater. The results show large and significant variation in severity among the raters, and a candidate's final result can be affected by which raters have assessed the test. Nevertheless, we find that most of the 77 raters assess consistently, showing high intra-rater reliability. This finding suggests that the raters, to a large degree, fulfil their role as independent experts with consistent rating behaviour. Finally, we discuss the challenges associated with the limitations of the test's design with respect to analysing rater reliability. We assess MFRM's role and suitability, and identify possible areas of future study.
Keywords: language testing, written assessment, rater-mediated assessment, inter-rater reliability, intra-rater reliability, Many-Facet Rasch Measurement
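For readers unfamiliar with the model, the many-facet Rasch model referred to in the abstract is commonly written in the following log-odds form. This is a standard textbook formulation of MFRM, not an equation reproduced from the article itself:

```latex
\log \frac{P_{njik}}{P_{nji(k-1)}} = B_n - C_j - D_i - F_k
```

Here \(P_{njik}\) is the probability that candidate \(n\), rated by rater \(j\) on task \(i\), receives rating category \(k\); \(B_n\) is the candidate's ability, \(C_j\) the rater's severity, \(D_i\) the task's difficulty, and \(F_k\) the threshold of category \(k\) relative to category \(k-1\). The rater severity parameter \(C_j\) is what allows the model to separate differences in rater strictness from differences in candidate ability.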
License
Content published in Acta Didactica is - unless otherwise stated - licensed under the Creative Commons License BY-NC-ND 4.0. Content can be copied, distributed and disseminated in any medium or format under the following terms:
Attribution: You must give appropriate credit and provide a link to the license.
Non-Commercial: You may not use the material for commercial purposes.
No derivatives: If you remix, transform, or build upon the material, you may not distribute the modified material.
No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notice: No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.
Authors who publish in Acta Didactica accept the following conditions:
Authors retain copyright to the article and give Acta Didactica the right of first publication, while the article is licensed under the Creative Commons CC BY-NC-ND 4.0. This license allows sharing the article for non-commercial purposes, as long as the author and the first place of publication, Acta Didactica, are credited.
The author is free to publish and distribute the work/article after publication in Acta Didactica, as long as the journal is referred to as the first place of publication. Submissions that are under consideration for publication or accepted for publication in Acta Didactica cannot simultaneously be under consideration for publication in other journals, anthologies, monographs or the like. By submitting contributions, the author accepts that the contribution is published online in Acta Didactica.