Notes on Making Good Progress – Part 4

Sometimes I just get carried away a bit. I managed to get an early copy of Daisy Christodoulou’s new book on assessment, Making Good Progress. I read it, and I made notes. It seemed a shame to do nothing with them, so I decided to publish them as blog posts (six of them, as the notes ran to about 6,000 words). They are only mildly annotated. I think they are fair and balanced, but you will only think so if you aren’t expecting either ‘oh, it’s the most important book ever’ or ‘it is absolutely useless’. I’ve encountered both in Twitter discussions.

PART 1 | PART 2 | PART 3 | PART 4 | PART 5 | CONCLUSION

PART 4 – A MODEL OF PROGRESS AND PRINCIPLES OF ASSESSMENT

This part addresses chapters 6 and 7.

Chapter 6 describes the first of the alternative models: a model of progress. I think it makes perfect sense to link summative and formative assessments, and I also applaud the suggestion that textbooks, or even digital textbooks, could play a larger role in the English curriculum. Here I have been influenced by my Dutch background, where using textbooks (in maths, my subject, for example) is quite normal. There is also ample research on textbooks from other countries. ‘Progression’ also seems to refer to starting with basic skills and ‘progressing’ to later phases. I immediately think of (sequences of) task design, worked examples, fading of feedback, scaffolding, etc. These are all common elements of instructional design and multimedia learning, yet they remain unmentioned. I think it’s good that the idea of ‘progression’ is made accessible for the average teacher, but I do wonder whether this is a missed opportunity: teachers could be helped in designing their lessons, even in the domain of assessment. This is followed by a discussion of some interesting threats to validity, including teaching to the test. I thought the author’s description of a progression model made sense; I imagine it is what humans have done over the centuries while designing curricula.

Measuring the progression (p. 155) repeats the assumption that if you are interested in generic skills (I agree with Christodoulou that that’s not enough) you will grade frequently. To my mind this is a bit of a rhetorical trick to make generic-skill lovers complicit in a testing regime. It is interesting that Christodoulou mentions the word ‘multidimensional’, because later on I will see it as one of the summative shortcomings of comparative judgement, which promotes a holistic judgement over judgements of separate elements. Of course I agree with the advice that we “need different scales and different types of assessment” (p. 159) and I also like the marathon analogy. But I do wonder what is new about that advice.

Then it’s onwards to principles for choosing the right assessments in chapter 7. To improve formative assessments some elements are promoted: specificity, frequency, repetition, and recording raw marks. I like how multiple-choice questions are ‘reinstated’ as being useful. I do think the advantages are exaggerated, especially because the subject context is disregarded, as are multiple-choice ‘guessing’ strategies. It is notable that Christodoulou does go into the latter criticism and explains how the drawbacks could be mitigated; I think it would have been good if these had also been addressed for more subjective essays. The maths example on p. 167 is fair enough, but technically (even with marking in mind) there is no reason not to make this an open question, one that could even provide feedback. I think it would also be useful to distinguish between the different types of knowledge that should underpin questions. I think it is perfectly fine to give multiple-choice questions a firm place for diagnostics (or even diagnostic databases, as there already are many of them), but the author could also highlight their cutting-edge potential more. Maybe it’s most useful not to say that one question type is ‘best suited’ but simply to say that one needs to ensure that the inferences drawn from the questions are valid; in other words, their validity. ‘Validity’ seems to be a term that underpins a lot of the author’s thinking, which makes it a shame that it wasn’t treated more elaborately in chapter 3. I like how the testing effect, and Roediger and Karpicke’s work, features from page 169, as well as desirable difficulties (Bjork) and spaced and distributed practice. These are all very relevant and could indeed inform teachers how to better organise their assessments.
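To make that last point a little more concrete, here is a minimal sketch of how spacing and retrieval practice could be organised, assuming a simple Leitner-style schedule; the box intervals and example prompts are my own illustration and are not taken from the book.

```python
from datetime import date, timedelta

# Minimal Leitner-style scheduler: items answered correctly move up a box and are
# reviewed after a longer gap (spacing); items answered incorrectly drop back to
# box 1 and are retested soon (retrieval practice / the testing effect).
BOX_INTERVALS = {1: 1, 2: 3, 3: 7, 4: 14, 5: 30}  # days between reviews (illustrative)

class Item:
    def __init__(self, prompt):
        self.prompt = prompt
        self.box = 1
        self.due = date.today()

    def record(self, correct, today=None):
        today = today or date.today()
        self.box = min(self.box + 1, 5) if correct else 1
        self.due = today + timedelta(days=BOX_INTERVALS[self.box])

def due_today(items, today=None):
    today = today or date.today()
    return [item for item in items if item.due <= today]

# Hypothetical diagnostic items; in practice these would come from a question bank.
items = [Item("Evaluate 5a + 2b for a = 2, b = 3"), Item("Simplify 3(x + 4)")]
for item in due_today(items):
    item.record(correct=True)  # in practice, based on the pupil's answer
```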

Notes on Making Good Progress – Part 3

PART 3 – DESCRIPTOR- AND EXAM-BASED ASSESSMENT
This part addresses chapters 4 and 5.
Chapter 4 critiques descriptor-based assessment. I think it is important here to distinguish between a bad implementation of a good policy and simply a bad policy. It starts by describing ‘assessment with levels’. I notice that the author often uses reading examples, which in principle is fine, but the danger is that we too quickly assume the argument applies to all subjects. I think the chapter does a good job of describing the drawbacks of descriptor-based systems. I do, however, feel that some of these drawbacks are no less prominent in the alternatives presented later. I also get the feeling that apples and oranges are sometimes compared in the ‘descriptive, not analytic’ section, because there is no reason not to simply do both. The comment on ‘generic, not specific’ with regard to feedback is spot on, but again there is no reason not to do both: generic AND more specific feedback, in my opinion. Actually, throughout the book I feel that the novice/expert distinction that had been so skilfully set out is not taken into account on many of the pages. As reviews of feedback use have shown, the type (and timing) of feedback interacts with levels of expertise.

The examples of different questions seem tied to their specific goal: the question on Stalin on page 94, for example, can be an excellent multiple-choice question on factual knowledge. However, if it were more about the relationships between certain events, a multiple-choice question might give the game away too much. The same goes for equations: multiple-choice questions do not make sense if your aim is to check equation-solving skill, but they would make sense if you want to check whether pupils can verify the correctness of solutions.

I think there is some confusion about reliability and validity here, most prevalent in the example on fractions. Yes, the descriptor on fractions is general, but that is often part of a necessarily somewhat vague set of descriptors in a curriculum. What Christodoulou then gives as an example (page 99) seems to be more about the validity and reliability of tests and assessments. Decades of psychometric research have provided insight into how to improve the reliability of assessment for summative purposes; it feels as if this is under-emphasised. Also, descriptor systems can be made more precise with mark-schemes and exemplars (as, by the way, is later presented in the comparative judgement context). A pattern in the book seems to be that:

  1. the author provides some good critiques of the drawbacks of existing practices,
  2. but then does not mention research on mitigating those drawbacks,
  3. nevertheless makes a case for change with a ‘solution’,
  4. but does not discuss how that solution addresses the drawbacks and/or introduces other drawbacks.

This could lead to a situation where readers nod along with the critique but then incorrectly assume the proposed solutions will solve those problems. I think it is admirable to describe the challenges in this accessible way, but I would have preferred a more balanced approach. As a case in point, take the ‘bias and stereotyping’ of page 104. This is a real challenge, and rightly seen as a point to address in descriptor-based assessment. Yet, as said before, there are ways to mitigate these drawbacks. Instead, the case is made that reform is necessary, and later on in the book a ‘solution’ is given that still uses teacher judgements, only ‘simpler’ (not really: holistic judgement is not simple per se, only when there is a short, uni-dimensional judgement to make, and the earlier condemnation of teacher judgement wasn’t about that, it was about complex judgements). In my view it just ‘pretends’ to be a solution for these well-observed challenges.

Chapter 5 critically assesses another assessment type, namely exam-based assessment. The somewhat exaggerated style is exemplified by the first sentence: “we saw that descriptor-based assessment struggles to produce valid formative and summative information”. The chapter first links the exam model to chapter 3’s distinction between the quality and the difficulty model. I am not convinced by the arguments that then try to explain why exam-based (summative) assessments are difficult to use for formative purposes. Sure, they are samples from a domain, but one can simply collect all summative questions on a certain topic or subject in order to make valid inferences. Sure, questions differ in difficulty, but there are ways to analyse that difficulty. The comments on pages 120 and 121 are fair (it is hard to say why an answer is right or wrong), but I can’t help thinking that the ‘solutions’ provided later on with comparative judgement, which uses Rasch analysis and ‘just correct or incorrect’, suffer from the same problem (granted, they are presented as a ‘summative’ solution). With maths exams there are mark-schemes, so a more fine-grained analysis *is* possible for formative purposes. The chapter *does* provide a nice insight into the difficulties regarding marking and judgement. A third problem, it is suggested, is that marks aren’t designed to measure formative progress. I think, again, that the book asks some good critical questions but ultimately sends out the message too strongly that old practices are bad. From page 130 the author argues there are issues with the summative affordances of exams as well. I think that this section, again with the fractions examples, exaggerates the ‘non-validity’ of exams. Testing agencies have developed a raft of tools to make exams valid across years and between samples. Again, the challenges and difficulties are described well, but ways to mitigate them are given too little attention. Further, the suggested ‘modular’ approach is good, but is this really new? The next four chapters are about alternative systems.

Notes on Making Good Progress – Part 2

PART 2 – SKILLS AND RELIABILITY

This part addresses chapters 2 and 3.

As I think the two approaches in chapter 1 are a bit of a caricature, I wonder whether this continues in chapter 2. At least there are some good examples of people who, in my view, take an exaggerated view of the ‘generic skills’ approach. Generic skills are rooted in domain knowledge. Yet it is *not* the case that you have to re-learn certain skills again and again in every domain. A good example, from my own area of expertise, is maths and spatial/mental rotation skills. There is a (limited) amount of transfer within groups of domains, tightly linked to schema building and so on. It is therefore unhelpful to present a binary choice here. What *is* good is to make people aware that generic skills courses *need* some domain knowledge.

The chapter uses quite a lot of quotes, some following the ‘7 myths’ approach of using Ofsted excerpts. Although I like this empirical element, and I even think there is something in some of the claims, it would have helped if the quotes felt less cherry-picked. The fact that generic skills are mentioned is no evidence that they are necessarily a focal point of teaching. In fact, generic skills seem to be presented as an almost ‘automatic’ outcome of ‘just teaching’ in the deliberate-practice approach, so even in the approach the author seems to prefer, generic skills will probably be mentioned. The chapter of course goes on to name-check ‘cognitive psychology’ and Adriaan de Groot (as in Hirsch’s latest book). It is good that this research is tabulated, and the addition of ‘not easily transferable’ already shows a bit more nuance (p. 33). Schemas are mentioned, and it is good that ‘acquiring mental models’ is put centre stage rather than ‘less cognitive load for working memory is best’. I felt these pages showed a wide range of references, though quite dated ones. I wholeheartedly agreed with the conclusion on page 37 that ‘specifics matter’, i.e. domain knowledge. It is telling that in discussing this, other ‘nuanced’ words appear, for example on page 38 when Christodoulou says ‘when the content changes significantly, skill does not transfer’. The interesting question then, to my mind, is when content is ‘significantly different’. My feeling is that this threshold is set far too low by some, and far too high by others. It would be good to discuss the ‘grey area’, just like the ‘grey area’ in going from a novice to an expert.

The section concludes with a plea for knowledge, practice and so on, with which I very much agree. It is the prelude to a section on deliberate practice, an interesting section with a role for Ericsson’s work. Practice is extremely important; I do wonder, though, whether the distinction between performance and deliberate practice is more blurred than presented. Originally, the discussion about deliberate practice seemed to revolve around ‘effort versus talent’, and this meta-review suggests there is more to becoming an expert than that. Yes, you practise specific tasks, but I think it is perfectly normal early on also to ‘perform’, whether in a test, a chess game or a concert. Or even to look at an expert and see how they do it (mimic). Not with the idea that you instantly become an expert, but with the idea that it all contributes to your path towards *more* expertise and solidifies schemas. In particular, the claim that ‘performance’ places too big a burden on working memory does not seem to be supported by a lot of evidence. It is not true that you can’t learn from reflecting on ‘performance’, as many post-match analyses show. Of course, one reason for this might be that the examples are all from the performing arts and sports, arguably more constrained by the component skills leading up to the ‘performance skill’, but then it is not me introducing these examples.

The quote at the bottom of page 41, “even if pupils manage to struggle through a difficult problem and perform well on it, there is a good chance that they will not have learnt much from the experience”, in my view plays semantic games with the phrase ‘difficult problem’. I wonder why ‘there is a good chance’ this is the case. It also poses interesting questions regarding falsifiability: after all, if a student does well on a post-test and shows a big gain in an experimental research setting, maybe they haven’t learnt anything? Maybe they just performed well. By now, I have seen enough of the over-relied-upon Kirschner, Sweller and Clark paper. Bjork’s ‘over-learning’ is an interesting addition: I would agree it can be good to over-learn, but we need to think about the magnitude (hours, days, weeks?), and unfortunately there is no mention of expertise reversal, where performance can get worse. On pages 42 and 43 I thought we would get to the crux, the difference in the aims of tasks, because I agree those are key in learning. While acquiring mental schemas the cognitive load does not have to be minimal, just as long as those schemas are taught. In assessments you don’t want the cognitive load to be too high, because you will fail your assessment.

The chapter finishes with an ‘alternative method’ as a ‘model of progression’. I am not sure why this is called an ‘alternative’, because it sounds as if it has been around for ages. It even echoes Bruner’s scaffolding (oh no!). The attention to peer- and self-assessment is interesting, but I’m not sure direct instruction methods really incorporate them, at least not in the often narrow terminology used in the edu blogosphere, although I have seen a broadening of the definition through ‘explicit instruction’. I’m sure some will point out that oft-ridiculed progressive behaviour of not understanding the definitions 😉 In sum, a useful chapter with a bit too much of a false choice.

The start of chapter 3 puzzles me a bit, because it begins by explaining how summative and formative functions are on a continuum. I agree with that, and find it at odds with Wiliam’s foreword, in which he seemed to confess that the functions need to be kept separate. The chapter discusses the concepts of validity and reliability. I am not completely sure I agree with the formulation that validity only pertains to the inferences we make based on the test results, but I haven’t read Koretz. There are many types of validity and threats to validity, and I would say it is *also* important that a test simply measures what it purports to measure (construct validity); the many sides of the term should be discussed more. The comment on sampling is an important one.

With reliability, I think the example of a 1 kg bag of flour is an awkward choice, as it suggests a measure can only be reliable if, in this case, it shows 1 kg. This is not the case: a scale that consistently measures, say, 100 grams over would still be reliable, just not valid for the construct being measured (mass). Reliability also isn’t an ‘aspect of validity’. When discussing unreliability it would have been helpful to be more precise in explaining the ‘confidence bands’, and perhaps measurement error. I get the feeling that the author wants to convey the message that measurements often are unreliable, but maybe I’m wrong. I very much like the pages (p. 64) on the quality and difficulty models; I agree that both models come with a trade-off between validity and reliability. There is a raft of literature on reliability and validity; Christodoulou chose only a little of it. As a whole, the chapter makes some useful links between summative and formative assessment. However, the example on page 70 is not chosen very well (and again note that there are many long quotes from other sources; more paraphrasing would be helpful), as in my view the first example (5a + 2b) *can* be a summative question if pupils are more expert (e.g. maths undergraduates). I like how Christodoulou tries to combine summative and formative assessments in the end, but I wonder what is actually new in what we need to make that happen.
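To make the flour-scale point concrete, here is a quick simulation; it is my own illustration rather than anything from the book, and the bias and noise figures are arbitrary. A scale with a constant 100-gram bias gives tightly clustered readings (reliable) that are systematically wrong (not valid), while an unbiased but noisy scale is roughly right on average yet unreliable.

```python
import random
import statistics

random.seed(1)
true_mass = 1000  # grams: the 1 kg bag of flour

# Biased but precise scale: reads about 100 g over, with very little noise.
biased_scale = [true_mass + 100 + random.gauss(0, 2) for _ in range(50)]
# Unbiased but noisy scale: centred on the true mass, with a lot of noise.
noisy_scale = [true_mass + random.gauss(0, 60) for _ in range(50)]

for name, readings in [("biased", biased_scale), ("noisy", noisy_scale)]:
    mean_error = statistics.mean(readings) - true_mass
    spread = statistics.stdev(readings)
    print(f"{name}: mean error {mean_error:.1f} g, spread (sd) {spread:.1f} g")

# The biased scale is reliable (tiny spread) but not valid (large mean error);
# the noisy scale is closer to valid on average but unreliable.
```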

Notes on Making Good Progress – Part 1

PART 1 – THE BEGINNING
This part addresses the foreword, introduction and chapter 1.

I have been following the English education blogosphere for some time now. Daisy Christodoulou is perhaps best known for her book ‘7 myths about education’ (and for winning University Challenge with her team). ‘7 myths’ was a decent book with some nice, accessible writing, especially useful because it gave knowledge a bit more attention again. Points for improvement were that they weren’t really 7 myths in my view (3 were variations on another myth), that the empirical backing was a bit one-sided, and that there was an error in quoting (revised) Bloom. But anyway: a fresh voice and some good ideas; bring it on, now in a new book on assessment.

The foreword of the book is (again) by Dylan Wiliam, perhaps best known for his ‘formative assessment’ work with Paul Black. After all the government malarkey on assessment with ‘assessment after levels’, he rightly emphasises the timeliness of the book: schools can now design new assessment systems. Of course it is telling that a book needs to address this; it could be argued, especially when a government is keen to point at top-performing PISA countries, that such an assessment system could be designed by the government. We now hear this more and more, but only after the old system was finished off, opening the way to all kinds of less empirically grounded and tested practices. The foreword ends with a statement I am not convinced by, namely that formative and summative assessment might have to be kept apart. For instance, it is perfectly acceptable to use worked examples from old summative assessments in a formative way. One could argue that both summative and formative assessments draw from the same source. In fact, in one of the promoted types of assessment, comparative judgement, one piece of advice seems to be to use exemplars so that students know what teachers are looking for: a summative and formative mix.

One thing that immediately strikes me is that I love the formatting: the book has a nice layout and a good structure. Throughout the book, the polygon diagrams perhaps suggest more structure than there is (who hasn’t used triangles? ;-)). In contrast to ‘7 myths’, each chapter really seems to tackle a separate issue, rather than the same issue in a different guise. The reference lists at the beginning are quite extensive, though for people who know the blogosphere a bit one-sided (Oates, Hirsch etc.). Later chapters have fewer references, and that is a shame because the second half is far more constructive and less ‘this and this is bad’ (more on that later). I can agree with a lot of the criticisms in the first half, and even with the drawbacks of ‘levels’, but I am less convinced that some of the proposed alternatives will be an improvement. More evidence would have helped there.

The book starts with an introduction. Unfortunately the introduction immediately sets the tone, and in an un-evidenced way: “In the UK, teacher training courses and the Office for Standards in Education, Children’s Services and Skills (Ofsted) encouraged independent project-based learning, promoted the teaching of transferable skills, and made bold claims about how the Internet could replace memory.” I find that a gross generalisation. Of course I know about the Robinsons and Mitras of the world, and there probably *are* people in those organisations (and outside them) saying this, but is it rife? It is a pattern that was also apparent in ‘7 myths’. The sentence after that, with ‘pupils learn best with direct instruction’ (no, novice pupils; it can even backfire with better pupils, the so-called expertise reversal effect) and ‘independent projects overwhelm our limited working memories’ (no, this depends on the amount of germane load or, if you will, element interactivity), in my view offers caricatures of the scientific evidence. In debates this is often parried with the argument that it is reasonable to simplify things this way. I’m not sure; my feeling is that this is actually how new myths take hold. Luckily, what follows is a good explanation and problem statement for the book; I think it is good to tackle the topic of assessment.

Chapter 1 starts with a focus on Assessment for Learning (AfL). I think the analysis of why AfL failed, partly focussing on the role and types of feedback, is a good one. Black and Wiliam themselves emphasised the pivotal role of feedback, in that it needed to lead to a change in behaviour in students; this did not seem to happen well enough. It is ironic, given what follows in later chapters, that on page 21 Christodoulou writes: “When government get their hands on anything involving the word ‘assessment’, they want it to be about high stakes monitoring and tracking, not low-stakes diagnostics.” I feel that when Nick Gibb embraces ‘comparative judgement’, this is exactly what is happening. The analysis then continues, on page 23, with a sketch of two broad approaches to developing skills: the ‘generic skills’ and ‘deliberate practice’ methods. I had the well-known ‘false dichotomy’ feeling here. By adding words like ‘generic’ and by linking one approach to ‘project-based’ learning, I felt there clearly was an ‘agenda’ to make one approach ‘wrong’ and the other ‘correct’. It even goes as far, on page 26, as saying that the ‘generic skills’ method leads to more focus on exam tasks; there is no real support for this supposition. Actually, some deliberate-practice methods focus on ‘worked examples’, where using exam tasks, and indeed ‘working with exam tasks’, would be reasonable. I agree that the approaches should be discussed, by the way, but, as with so many discussions on the web, not in a dichotomous way when the evidence points to more nuance.

Thoughts on Comparative Judgement

This is just a quick post to collect some thoughts on (Adaptive) Comparative Judgement. Recently it seems to have gained a lot of traction in the UK education blogosphere, to the extent that the minister, experts for the Education Select Committee and the NAHT are already mentioning it. I think the technique, originally from Thurstone and in its adaptive form from Pollitt, is technically great, but the advantages might be exaggerated, certainly on a national, summative scale. Nevertheless, I hardly see any critical voices, except for this blog, which is excellent on the topic, so being an arch-sceptic I thought I’d write down some of my thoughts.
  • Firstly, it is important to take into account the nature of the tasks being assessed, for example the subject. For some subjects, mark-schemes are perfectly fine and pretty reliable (more on what we mean by that later); it is with more subjective tasks that there are challenges. So maths is perhaps less of a problem than, say, an English essay. This Ofqual review is sometimes referenced, and it gives a far more nuanced position on the reliability of marking, although it leans heavily on Meadows and Billington (2005).
  • But even then, as the Ofqual review of marking reliability shows, there are decades’ worth of procedures and actions you can take to increase validity and reliability. It seems as if the idea has taken hold that we have marked unreliably for decades. The question is not whether marking is unreliable or not, but whether reliability was good enough (note that this also requires a good conception of what reliability and validity are, and I am not always sure that is there).
  • The ‘good enough’ question partly depends on what you want to do with the results, I guess. The more high-stakes the assessment, the more important it is. One could even see a ‘cut’ depending on whether you use the assessment for formative or summative purposes. I do not see enough discussion of these aspects.
  • In the discussion it is also important to say what ‘reliability’ means anyway. Agreeing on a rank (“I think A is better than B”) is different from agreeing on a quantification of the mark (“A is 80, B is 60”). To compare like with like you need to use the same type of measure.
  • Some CJ literature has shown that some of the challenges of traditional marking of course still apply: for example the influence of the length of the work, or its multidimensional nature (note the potential subject differences again). The assumption that you just ‘say which is the better work’, holistically, and that this will then lead to a lot of agreement (statistically) seems tricky to me (a minimal sketch of how such pairwise judgements are turned into a scale follows after this list). Even if your conception of the “better piece of writing” (the “OK to go with your gut instinct”) is clear (is it? Can’t groups engage in groupthink?), it is still important to know what to look for, e.g. originality, spelling, handwriting style. Certainly if at some point you do want to give students feedback or exemplars of higher-scoring work. This also touches on the ‘summative/formative’ issue.
  • Which leads me to think that the ‘old’ situation is painted too much as ‘not good enough’ and the new one as improving many things, in the summative sense.
  • Certainly if we take into account claims like ‘more efficient’, ‘less time’ and ‘reduces workload’, I think it is too facile to say we can get all that AND greater reliability.
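
For readers unfamiliar with the statistical core of CJ, here is a minimal sketch of how pairwise ‘A is better than B’ judgements can be turned into a scale. It assumes a simple Bradley-Terry model fitted by gradient ascent on simulated judgements; real systems use Rasch-style estimation and adaptive pairing, so this is only my own illustration of the idea, not how any particular tool works.

```python
import math
import random

random.seed(0)

# Toy "true" qualities of six scripts; judges only ever see pairs.
true_quality = {"A": 2.0, "B": 1.2, "C": 0.8, "D": 0.0, "E": -0.7, "F": -1.5}
scripts = list(true_quality)

def judge(x, y):
    """Simulate a holistic judgement: P(x wins) follows a logistic model."""
    p = 1 / (1 + math.exp(-(true_quality[x] - true_quality[y])))
    return x if random.random() < p else y

# Collect pairwise judgements (random pairing; adaptive CJ would choose pairs on the fly).
comparisons = []
for _ in range(400):
    x, y = random.sample(scripts, 2)
    comparisons.append((x, y, judge(x, y)))

# Fit a Bradley-Terry model by simple gradient ascent on the log-likelihood.
theta = {s: 0.0 for s in scripts}
for _ in range(200):
    grad = {s: 0.0 for s in scripts}
    for x, y, winner in comparisons:
        p_x = 1 / (1 + math.exp(-(theta[x] - theta[y])))
        outcome = 1.0 if winner == x else 0.0
        grad[x] += outcome - p_x
        grad[y] -= outcome - p_x
    for s in scripts:
        theta[s] += 0.01 * grad[s]

# Estimated scale; with enough comparisons the rank order should match true_quality.
print(sorted(theta, key=theta.get, reverse=True))
```

Note that the estimated values are only defined up to an additive constant, and they say nothing about *what* made one script better, which is the feedback issue mentioned above.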

I think the developments around ‘comparative judgement’ at a potential policy level are going far too quickly, and are based on some misconceptions about validity and reliability and, maybe most of all, on purported time savings and workload reduction. In my view ‘more efficient’ and ‘more reliable’ aren’t reasons why you would want CJ (or rather ACJ), as these advantages over traditional assessment might hold for only a small part of summative assessments, namely those that are relatively short, uni-dimensional and subjective (e.g. a short English essay). And even then it pays to check the costs involved, from scanning the work to the training courses, the marking time and so on. Simply suggesting that at some point those costs will all disappear and you are left with a nice 30-second comparison is not really painting the full picture. We always need to ask that age-old social media question: ‘is it worth the opportunity cost?’. This does not mean we couldn’t continue piloting and testing (producing evidence on) exactly the issues I have just mentioned. And if we are doing that anyway, we might look at some applications that seem more promising to me than a limited, national, summative application:

  • As a tool for continuing professional development with teams of teachers, to increase awareness of marking practices. We actually did something like this in our department when I was teaching in the Netherlands.
  • Meso-moderation: at the level of groups of schools.
  • Experimenting with assessment types not yet used, for example open-ended questions in maths.
  • Peer assessment. The evidence base could be linked to emerging research on peer assessment and formative practices.

Finally, what I would always like is for pilots and experiments (no policy changes, please!) to be facilitated, and for these not to involve promoting paid services at this point. Apart from No More Marking (who I think have started to charge?), the open-source platform featured by the Belgian project D-PAC might be interesting.

Recent presentation: Mental Rotation Skills

I gave two paper presentations recently at the BSRLM day conference in Brighton. Abstracts and slides are below.

Bokhove, Christian* & Redhead, Ed
University of Southampton
c.bokhove@soton.ac.uk
@cbokhove
Training mental rotation skills to improve spatial ability
Prior research indicates that spatial skills, for example in the form of Mental Rotation Skills (MRS), are a strong predictor of mathematics achievement. Nevertheless, findings are mixed as to whether this is more the case for other spatial tasks or, as others have stated, for numerical and arithmetical performance. In addition, other studies have shown that MRS can be trained and that they are a good predictor of another spatial skill: route learning and wayfinding. This paper presentation explores these assumptions and reports on an experiment with 43 undergraduate psychology students from a Russell Group university in the south of England. Participants were randomly assigned to two conditions. Both groups completed pre- and post-tests on wayfinding in a maze. In between, the intervention group trained with an MRS tool the first author designed in the MC-squared platform, based on a standardised MRS task (Ganis & Kievit, 2015), while the control group did filler tasks by completing crossword puzzles. Collectively, the 43 students completed 43×48 = 2064 MRS assessment items and 2×43 = 86 mazes. Although the treatment group showed a decrease in the time needed to complete the maze task, while the control group showed an increase, these changes were not significant. Limitations are discussed.
http://www.slideshare.net/cbokhove/mental-rotation-skills

Recent presentation: digital books for creativity

I gave two paper presentations recently at the BSRLM day conference in Brighton. Abstracts and slides are below.

Geraniou, Eirini*, Bokhove, Christian* & Mavrikis, Manolis
UCL Institute of Education, University of Southampton, UCL Knowledge Lab
e.geraniou@ucl.ac.uk
Designing creative electronic books for mathematical creativity
There is potential and great value in developing digital resources, such as electronic books, and investigating their impact on mathematical learning. Our focus is on electronic book resources which we refer to as c-books: extended electronic books that include dynamic widgets and an authorable data analytics engine. They have been designed and developed as part of the MC Squared project (www.mc2-project.eu/), which focuses on social creativity in the design of digital media intended to enhance creativity in mathematical thinking (CMT). Researchers collaborating with mathematics educators and school teachers form Communities of Interest and Practice (COI and COP) that work together to creatively think about and design c-book resources reflecting current pedagogy for CMT in schools. We plan to present a number of these books and discuss how they were designed. We will share our reflections from using one of the c-books in a school study and highlight its impact on students’ learning, but also how c-books could be integrated into the mathematics classroom.
http://www.slideshare.net/cbokhove/designing-creative-electronic-books-for-mathematical-creativity

Thoughts on memorisation

I don’t feel I have the time to really write (or rather finish) a blog piece on memorisation. But this sequence of tweets, written as a reaction to some memorisation claims in Scientific American, gives some support for why I think that image is a caricature. Memorisation, procedures and understanding all go hand in hand (see for example Rittle-Johnson, Siegler, Alibali, Star etc.). I leave it to others to write a real blog 😉

https://storify.com/cbokhove/thoughts-on-memorisation

Keynote at CADGME 2016

These are the slides for the keynote I gave at CADGME 2016.

Presentation ICME13

This is the presentation I gave at ICME-13:

Opportunity to learn maths: a curriculum approach with TIMSS 2011 data
Christian Bokhove
University of Southampton

Previous studies have shown that socioeconomic status (SES) and ‘opportunity to learn’ (OTL), which can be typified as ‘curriculum content covered’, are significant predictors of students’ mathematics achievement. Seeing OTL as a curriculum variable, this paper explores multilevel models (students in classrooms in countries) and appropriate classroom (teacher) level variables to examine SES and OTL in relation to mathematics achievement in the 2011 Trends in International Mathematics and Science Study (TIMSS 2011), with OTL operationalised in several distinct ways. Results suggest that the combination of SES and OTL explains a considerable amount of variance at the classroom and country levels, but that this is not driven by country-level OTL after accounting for SES.
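For readers who want to see what such a model looks like in practice, here is a minimal sketch, assuming a two-level simplification (students nested in classrooms) fitted with statsmodels in Python. The file name and column names are placeholders of my own, not the actual TIMSS variables, and the full analysis in the paper adds the country level and handles TIMSS plausible values and weights.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical prepared data set: one row per student, with a maths score,
# student-level SES, classroom-level OTL and a classroom identifier.
df = pd.read_csv("timss_prepared.csv")  # columns: math_score, ses, otl, classroom_id

# Random-intercept model: students (level 1) nested in classrooms (level 2).
# The paper's models add a country level and use several operationalisations of OTL.
model = smf.mixedlm("math_score ~ ses + otl", data=df, groups=df["classroom_id"])
result = model.fit()
print(result.summary())
```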

Full paper, slides: