Because I had written extensive notes, I thought I’d post them as a series of blog posts. All the posts are collected in this pdf (I may have made slight changes to the posts over time that are not reflected in the pdf).
Part 1 – foreword, introduction, chapter 1
Part 2 – chapters 2 and 3
Part 3 – chapters 4 and 5
Part 4 – chapters 6 and 7
Part 5 – chapter 8
Part 6 – chapter 9 and conclusion
In conclusion, I think that a teacher who wants to read a timely book with a lot of interesting content on assessment would do well to read this one. They should, however, read it bearing in mind that in places the situation is presented somewhat one-sidedly: in my view too negative about the ‘old’ situation and too positive about alternative models. Teachers can profit from the book, but this one-sidedness can also mean that they miss out on decades of unmentioned research on curriculum, psychometrics and assessment. I would therefore encourage them to (i) read the book, (ii) follow up the references and (iii) also read a bit wider. Of course, one cannot write a 1000-page ‘accessible’ book, but given the number of footnotes a bit more depth in some places would have been good. Particular points are:
- Yes, the implementation of Assessment for Learning (AfL) has been problematic. The book says something about the importance of feedback, but does not cover enough of the prior research.
- I recognise the discussion of generic versus domain-specific skills, but in my view it is presented too dichotomously. There is more than Willingham, for example Sternberg and Roediger on critical thinking. In addition, the claim that this debate leads to certain assessment practices (e.g. teaching to the test) is unevidenced. There also exist fair criticisms of deliberate practice.
- The introduction of a quality and difficulty model is useful but again rather binary.
- Reliability and validity are covered, but only quite superficially (types of validity, threats to validity etc.), and reliability is, in my view, not covered correctly: the example with 1kg on a scale is an example of a measurement that is reliable AND valid, so it does not tease out the essential test-retest characteristic of reliability.
- Yes, there are problems with descriptor-based assessments but there is a raft of research addressing their validity and reliability.
- The progression model makes sense, but haven’t people been doing this for decades (e.g. in good textbooks)?
- The attention given to the testing effect, spaced practice, multiple choice questions is well done.
- Comparative Judgement is worth examining (critically), but (i) it is no silver bullet, (ii) it is probably only applicable to niche objectives, (iii) several pressing questions remain to be asked, and (iv) its strength may lie even more in the formative realm.
- The proposed integrated system describes what is already in place, with a plea to collaborate. This is good, but we must realise that its not having worked out over the years is, in my view, mainly a funding issue.
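To make the reliability point above a bit more concrete, here is a minimal sketch of my own (it is not from the book). It simulates weighing the same 1kg reference weight many times on two hypothetical scales. The offsets and noise levels are made-up numbers, and treating "valid" as "correct on average" is a deliberate simplification of a much richer concept. The first scale gives nearly identical readings every time but is consistently off (reliable, not valid); the second is right on average but its readings jump around (not reliable).

```python
import random
import statistics

TRUE_WEIGHT_KG = 1.0  # the known reference weight placed on each scale


def weigh(offset_kg, noise_sd_kg, n_readings, rng):
    """Simulate repeated weighings of the same reference weight."""
    return [TRUE_WEIGHT_KG + offset_kg + rng.gauss(0.0, noise_sd_kg)
            for _ in range(n_readings)]


def summarise(readings):
    """Return (bias, spread): the mean error against the true weight and
    the standard deviation across the repeated readings."""
    bias = statistics.mean(readings) - TRUE_WEIGHT_KG
    spread = statistics.stdev(readings)
    return bias, spread


rng = random.Random(0)  # fixed seed so reruns give the same readings

# Hypothetical scales: offsets and noise levels are assumptions for illustration.
consistent_but_off = summarise(weigh(0.2, 0.005, 1000, rng))
accurate_but_erratic = summarise(weigh(0.0, 0.2, 1000, rng))

for name, (bias, spread) in [
    ("consistent but off", consistent_but_off),
    ("accurate but erratic", accurate_but_erratic),
]:
    print(f"{name}: bias={bias:+.3f} kg, spread={spread:.3f} kg")
```

A small spread across repeated readings is what the test-retest idea is after; a small bias says nothing about it. A scale that simply shows 1kg for a 1kg weight has both properties at once, which is why that example cannot separate them.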
One might wonder ‘why mention this, it’s great that this topic gets some attention?’, but I simply have to refer to what the author states towards the end of the book: ‘Assessment is a form of measurement’ and ‘flawed ideas about assessment have encouraged flawed classroom practice’ (p. 212). If raising these points is the main aim of the book, then it surely increases awareness of them, but without covering the basics in more depth, I fear we don’t get the complete picture. Overall, I would say it’s an interesting, good book, but not outstanding. 3.5*.