I’m constantly challenging myself with regard to Comparative Judgement. In an earlier blog post I explained why I think there might be better reasons to use it than ‘efficiency’, ‘workload’ and ‘more reliable’. I extended (and repeated) this in this book review. To me, professional development, moderation and formative feedback seem much more promising. However, I think many teachers, especially English teachers, are simply so disenchanted with KS2 English writing that they frankly see anything as an improvement. They are willing to replace the current summative SATs with a different summative assessment. In the meantime I have seen several examples where I would say that teachers tried to strike a balance between summative and formative elements. Good.
Recently, though, there is one particular aspect I have been thinking about some more, and that is the challenge of justifying a grade to one of your pupils. Is the Comparative Judgement (CJ) process, as a process that assigns grades to (subjective) work, not in danger of delegating away the justification for the mark? If you are a pupil’s teacher and you give a certain mark, you know why this is the case. You go into a moderation meeting knowing why you gave a particular mark. You might feel it is a bit subjective, and that the criteria are ambiguous, but at least you have professional ownership of that mark. I expect that you can probably explain the mark to a reasonable degree. Even after moderation, you probably know why that particular piece scored what it scored. What about CJ? The judgement is holistic (at least in the widely promoted version that saves time and thus reduces workload). The grade is based on the ‘collective’ judgement of many judges. There is no feedback as to the *why* for individual pieces of writing. So what is the justification for a particular grade? What will you tell your pupil? Maybe you feel simply referring to the collective opinion of a set of experts is enough, but surely we would want to be somewhat more precise than this?
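To make concrete what the process does and does not record, here is a minimal sketch of the statistical backbone CJ tools typically use: a Bradley-Terry model fitted to pairwise ‘A beat B’ decisions. This is an illustration under my own assumptions, not any particular vendor’s engine, and the script names and judgements are invented. The point: the only output is a position on a latent scale; nothing in the model stores *why* one script won a comparison.

```python
# A minimal sketch (invented data, not any vendor's actual engine) of how
# comparative judgement turns pairwise holistic decisions into one scale
# value per script, via a Bradley-Terry model and the classic MM iteration.
from collections import defaultdict
import math

# Each tuple is one holistic judgement: (winner, loser).
judgements = [
    ("script_A", "script_B"), ("script_A", "script_C"),
    ("script_B", "script_C"), ("script_C", "script_D"),
    ("script_B", "script_D"), ("script_A", "script_D"),
    ("script_B", "script_A"), ("script_D", "script_C"),  # judges disagree
]

scripts = sorted({s for pair in judgements for s in pair})
wins = defaultdict(int)         # total wins per script
pair_counts = defaultdict(int)  # comparisons per unordered pair

for winner, loser in judgements:
    wins[winner] += 1
    pair_counts[frozenset((winner, loser))] += 1

# MM update: p_i <- W_i / sum_j n_ij / (p_i + p_j)
strength = {s: 1.0 for s in scripts}
for _ in range(200):
    new = {}
    for i in scripts:
        denom = sum(
            pair_counts[frozenset((i, j))] / (strength[i] + strength[j])
            for j in scripts
            if j != i and frozenset((i, j)) in pair_counts
        )
        new[i] = wins[i] / denom if denom else strength[i]
    total = sum(new.values())  # normalise so strengths don't drift
    strength = {s: v / total for s, v in new.items()}

# The output is a rank order on a latent scale -- nothing here records
# *why* one script beat another, so there is no 'why' to report back.
for s in sorted(scripts, key=strength.get, reverse=True):
    print(s, round(math.log(strength[s]), 2))
```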
One way to tackle this, it has been suggested, is a bank of annotated exemplars. It is not always clear whether these are meant for teachers and students or just for teachers. If it’s just teachers, then I guess we still have the same problem as before, in that pupils will not know about the *why* of a grade. If, however, they are used as exemplifications of what earns a higher or lower grade, I also think it is wishful thinking that pupils (and even teachers!) will simply scrutinise a pack of annotated examples and extract where they can improve. It is ironic that this seems to incorporate a form of ‘discovery’: “discover the criteria we holistically and collectively used, and therefore don’t know ourselves, but hey, we are experts, to judge for yourself what is important”. I predict that very swiftly, teachers will be formulating criteria again: ‘this exemplar had only a few SPaG errors, was well structured but not very original, and therefore scored higher than this other one that had a similar amount of SPaG errors and similar structure but was highly original’. Always from a comparative perspective, of course: an absolute mark, although assigned in a summative assessment, could only be justified in relative terms. I continue to think that many of the challenges correctly diagnosed in descriptor- and criterion-based assessments will continue to exist, but now accompanied by a myth that assessment is very easy: you just compare pieces with each other and judge them holistically. Rather than think this, I think it is better to appreciate that assessment is hard, and to adopt broader conceptions of what constitutes reliability and validity.
4 replies on “Explaining a grade”
Is that kind of feedback inconsistent with comparative judgement? Isn’t it essentially a ranking process, but based on the criteria as interpreted by many judges, and then in relation to another piece of work? A student can consider grade boundaries, but grade criteria less so. If we were to give the student access to a piece considered to be at a much higher level, would it be feasible to ask them to work out why theirs was ranked lower? Or vice versa?
I think in the ‘pure’ form of CJ that kind of feedback is not present. If it is only seen as a summative way to get grades, some see the disconnect of the summative from the formative as a strength. I’m not sure I agree, because apart from perhaps wanting to learn from your weaker points, I am more and more thinking a student at least deserves an explanation of why they got the grade they got. Of course one can think of hybrid forms where there is both feedback and a ranking process (I think the Belgian project on CJ does this), but that at least acknowledges that some form of (formative) feedback is needed.
On the suggestion about studying exemplars: I’ve heard that before. I think it is feasible, just as it is now. I do, however, think that this is even more challenging than studying exemplars in a criterion-referenced system. After all, interpretations are left to non-experts (students) and the criteria are not made explicit. Certainly if the judgement of experts, as required by the CJ process, is multidimensional (let’s say experts ‘holistically’ judge structure, SPaG, flow and originality), I doubt students can easily infer this.
Would the truth of the grade they got be something like, ‘Well it was better than a C but not as good as an A’? Full explanations based on criteria are only really possible when grading is criterion-referenced. Ultimately, it’s one piece of writing and, as you say, for summative purposes. By the time a student’s piece is submitted, the formative process should have made strengths and weaknesses explicit. On the other hand, could CJ incorporate a quick checklist of key features as part of the process of judgement?
Based on the official CJ procedure, the idea is ‘holistic’ marking, and hence any allusion to ‘key features’ would contradict this ‘holistic’ approach, I think. Weights, especially, are hard to convey if you judge more than one dimension.
To be fair, if CJ were to stay only in the summative KS2 writing realm (subjective, fairly unidimensional, not too lengthy writing; two pages is already too long), I would not have so many questions about it, especially given the current problems. But as with all widely promoted ‘new’ things, I think people are losing sight of that aim and see all kinds of other applications. That would be fair enough if it weren’t so often accompanied by a general ‘it’s all such a shambles now’, suggesting that the new procedure would solve all of this. But no:
1. CJ *only* solves criterion-based problems if you do not use any criteria any more, including exemplars. The fact that the latter are proposed by some simply means reintroducing criteria. In that case, don’t trash criterion-based assessment so much.
2. CJ might only mean less workload and time if you use it *only* summatively. However, even then most studies show at least as much ‘collective’ effort: you might be done in 30 minutes, but each script needs multiple pairwise judgements, so more people need to judge. To illustrate: a pool of, say, 300 scripts at roughly ten judgements per script still means on the order of 1,500 pairwise decisions (each comparison involves two scripts) spread across the judging pool. Psychologically this might feel like ‘less’, but at system level it is not, certainly if you need to train more people.
3. Following on from 2: if you want more (formative feedback, for example) or want to judge more than one dimension, this just ‘eats’ into the total amount of work AND reintroduces those darn criteria.
4. The final ‘sell’ often concerns reliability, but it simply is a different type of reliability: like is not compared with like. Just imagine a combination of criteria and exemplars with, say, four experts who, rather than assigning marks, are asked to rank-order the work. I think they would reach high agreement too (a rough sketch of measuring that kind of rank agreement follows below).
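To make point 4 concrete, here is a rough sketch with invented rankings: four hypothetical experts rank the same six scripts, and we measure agreement on the *ordering* using Kendall’s coefficient of concordance (W), a standard statistic for rank agreement. The rankings are made up for illustration; the point is only that agreement on an ordering is a different quantity from agreement on absolute marks, so the two reliability figures are not directly comparable.

```python
# Hedged illustration: rank agreement (Kendall's W) among four
# hypothetical experts ordering the same six scripts. Invented data.

def kendalls_w(rankings):
    """Kendall's coefficient of concordance for a list of rank lists
    (no ties). Each inner list ranks the same n items; rank 1 = best."""
    m = len(rankings)      # number of judges
    n = len(rankings[0])   # number of items ranked
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_rank = m * (n + 1) / 2
    s = sum((rs - mean_rank) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

judges = [
    [1, 2, 3, 4, 5, 6],  # judge 1
    [2, 1, 3, 4, 6, 5],  # judge 2: near-identical order
    [1, 3, 2, 5, 4, 6],  # judge 3
    [2, 1, 4, 3, 5, 6],  # judge 4
]
print(round(kendalls_w(judges), 3))  # ~0.85: high rank agreement
```

With these invented rankings W comes out around 0.85, i.e. the four experts agree strongly on the ordering, even though they might have assigned quite different absolute marks. That is exactly why a CJ reliability figure and a criterion-marking reliability figure are not like for like.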