Categories
Education Research ICT Math Education MathEd Tools

Recent presentation: Mental Rotation Skills

I gave two paper presentations recently at the BSRLM day conference in Brighton. Abstracts and slides are below.

Bokhove, Christian* & Redhead, Ed
University of Southampton
c.bokhove@soton.ac.uk
@cbokhove
Training mental rotation skills to improve spatial ability
Prior research indicates that spatial skills, for example in the form of Mental Rotation Skills (MRS), are a strong predictor of mathematics achievement. Nevertheless, findings are mixed on whether this holds more for other spatial tasks or, as others have stated, for numerical and arithmetical performance. In addition, other studies have shown that MRS can be trained and that they are a good predictor of another spatial skill: route learning and wayfinding. This paper presentation explores these assumptions and reports on an experiment with 43 undergraduate psychology students from a Russell Group university in the south of England. Participants were randomly assigned to two conditions. Both groups took pre- and post-tests on wayfinding in a maze. In between, the intervention group trained with an MRS tool the first author designed in the MC-squared platform, which was based on a standardized MRS task (Ganis & Kievit, 2015). The control group did filler tasks by completing crossword puzzles. Collectively, the 43 students completed 43×48=2064 assessment items for MRS, and 2×43=86 mazes. The treatment group showed a decrease in the time needed to do the maze task while the control group saw an increase, but these changes were not significant. Limitations are discussed.
http://www.slideshare.net/cbokhove/mental-rotation-skills
Categories
Education Research ICT Math Education MathEd Tools

Recent presentation: digital books for creativity

I gave two paper presentations recently at the BSRLM day conference in Brighton. Abstracts and slides are below.

Geraniou, Eirini*, Bokhove, Christian* & Mavrikis, Manolis
UCL Institute of Education, University of Southampton, UCL Knowledge Lab
e.geraniou@ucl.ac.uk
Designing creative electronic books for mathematical creativity
There is potential and great value in developing digital resources, such as electronic books, and investigating their impact on mathematical learning. Our focus is on electronic book resources, which we refer to as c-books: extended electronic books that include dynamic widgets and an authorable data analytics engine. They have been designed and developed as part of the MC Squared project (www.mc2-project.eu/), which focuses on social creativity in the design of digital media intended to enhance creativity in mathematical thinking (CMT). Researchers collaborating with mathematics educators and school teachers form Communities of Interest and Practice (COI and COP) that work together to creatively think about and design c-book resources reflecting current pedagogy for CMT in schools. We plan to present a number of these books and discuss how they were designed. We will share our reflections from using one of the c-books for a school study and highlight its impact on students’ learning, as well as how c-books could be integrated in the mathematics classroom.
http://www.slideshare.net/cbokhove/designing-creative-electronic-books-for-mathematical-creativity
Categories
Education Research

Keynote at CADGME 2016

These are the slides to the keynote I gave at CADGME 2016.

Categories
Education Research Math Education Research Statistical Methods

Presentation ICME13

This is the presentation I gave at ICME-13:

Opportunity to learn maths: a curriculum approach with TIMSS 2011 data
Christian Bokhove
University of Southampton

Previous studies have shown that socioeconomic status (SES) and ‘opportunity to learn’ (OTL), which can be typified as ‘curriculum content covered’, are significant predictors of students’ mathematics achievement. Seeing OTL as a curriculum variable, this paper explores multilevel models (students in classrooms in countries) and appropriate classroom (teacher) level variables to examine SES and OTL in relation to mathematics achievement in the 2011 Trends in International Mathematics and Science Study (TIMSS 2011), with OTL operationalised in several distinct ways. Results suggest that the combination of SES and OTL explains a considerable amount of variance at the classroom and country level, but that this is not caused by country-level OTL after accounting for SES.

Full paper, slides:

Categories
Education Research

Economic papers about education (CPB part 2)

This is a follow-up to this post, in which I unpicked one part of a large education review. In that post I covered aspects of papers by Vardardottir, Kim, Wang and Duflo. In this post I cover another paper in that section (page 201).

Booij, A.S., E. Leuven and H. Oosterbeek, 2015, Ability Peer Effects in University: Evidence from a Randomized Experiment, IZA Discussion Paper 8769.
This is a discussion paper from the IZA series, a high-quality series of working papers, but -of course- not yet a peer-reviewed journal version. Maybe there is one by now, but clearly this version was used for the review. I had already noticed previously that there can be considerable differences between working papers and the final version; just see Vardardottir’s evaluation of Duflo et al.’s paper.
The paper concerns undergraduate economics students. A first observation would be that it might be difficult to generalize wider than ‘economics undergraduates from a general university in the Netherlands’. Towards the end it is, however, argued that together with other papers (Duflo, Carrell) a pattern is emerging. The first main result is in Table 4.
The columns show how the models were built. Column (1) is the basic model, with only the mean of peers’ Grade Point Average (GPA) and ‘randomization controls’ included. Column (2) adds controls like ‘gender’, ‘age’ and ‘professional college’. Column (3) adds the Standard Deviation (SD) of peers’ GPA in a tutorial group. Columns (1) to (3) do not show any effect. Only in column (4), where non-linear terms and an interaction are added, do some significant variables appear, as indicated by the **. The main result seems rather borderline, but in the context of ability grouping it is Table 5 that is more interesting.
In that table different tracking scenarios are studied. The first column shows overall effects compared to ‘mixed’, so this looks at the ‘system’ as a whole. Columns (2) to (4) show the differentiated effects. From this table I would deduce:
  • In two-way tracking, lower ability gain a little bit (10% significance in my book is not significant), higher ability gain a little bit (borderline 5%).
  • Three-way tracking: middle and low gain some, high doesn’t.
  • Track Low: low gains, middle more (hypothesis: less held back?), high doesn’t.
  • Track Middle: only middle gains (low slightly negative but not significant!).
  • Separate high ability: no one gains.

This is roughly the same as what is described in the article on page 20. The paper then also addresses average grade and dropout. The paper actually goes into many more things (teachers, for example) which I will not cover. It is interesting to look at the conclusions, and especially the abstract. I think the abstract follows from the data, although I would not have said “students of low and medium ability gain on average 0.2 SD units of achievement from switching from ability mixing to three-way tracking”, because it seems to be 0.20 and 0.18 respectively (so 19% as mentioned in the main body text). Only a minor quibble which, after querying, I heard has been changed in the final version. I found the discussion very limited. It is noted that in different contexts (Duflo, Carrell) roughly similar results are obtained (but see my notes on Duflo).

Overall, I find this an interesting paper which does what it says on the tin (bar some tiny comments). Together with my previous comments, though, I would still be wary about the specific contexts.


Categories
Education Research

Unpicking economic papers: a paper on direct instruction

This paper, with the title “Is traditional teaching really all that bad?”, is by Schwerdt and Wuppermann, and the title makes clear that it sets out to show it isn’t. Without this paper I would have said the same thing, simply because I wouldn’t deny that ‘direct instruction’ has had a rough treatment in the last decades.

There are several versions of this paper on SSRN and other repositories. The published version is from ‘Economics of Education Review’, and this immediately shows why I have included it. With the rise of economics papers in education, some have preferred to use this paper rather than a more sociological, psychological or education research approach.


The literature review is, as is often the case in economics papers in my opinion, a bit shallow. The study uses TIMSS 2003 year 8 data (I don’t know why they didn’t use 2007 data).

I find the wording “We standardize the test scores for each subject to be mean 0 and standard deviation 1.” a bit strange, because the TIMSS dataset, as in later years, does not really have ‘test scores per subject’: individual students do not take all the assessment items.

Instead, there are five so-called ‘plausible values’. Not using them might underestimate the standard error, which might lead to results being deemed significant more readily. This variable is the outcome; another variable is question 20.
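To illustrate the point about plausible values, here is a minimal sketch (with made-up numbers, not the paper's or TIMSS data) of how estimates over the five plausible values are typically pooled with Rubin's combining rules, and why the pooled standard error ends up larger than that of any single plausible value:

```python
import statistics

# Illustrative estimates of the same coefficient, one per plausible value
# (invented numbers, not taken from the paper or from TIMSS).
estimates = [0.51, 0.48, 0.55, 0.50, 0.53]
variances = [0.04, 0.05, 0.04, 0.05, 0.04]  # squared standard errors

m = len(estimates)
pooled_estimate = statistics.mean(estimates)
within_var = statistics.mean(variances)            # average sampling variance
between_var = statistics.variance(estimates)       # variance across plausible values
total_var = within_var + (1 + 1 / m) * between_var # Rubin's combining rule

pooled_se = total_var ** 0.5
single_se = variances[0] ** 0.5
# The pooled SE exceeds the single-value SE, so significance judged on
# one plausible value alone is too optimistic.
print(pooled_estimate, pooled_se, single_se)
```

With only the first plausible value the standard error is understated, which is exactly the mechanism by which results can appear significant "more readily".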

The distinction between instruction and problem solving is based on three of these items: b is seen as direct instruction, c and d together as problem solving (note that one of them does of course mention ‘guidance’). There is an emphasis on ‘new material’, so I can see why these were chosen. Of course the use of percentages means that an absolute norm is not apparent, but I can see how lecture%/(lecture%+problemsolving%) denotes a ratio of lecturing. The other five elements are together used as a control. Mean imputation was used (I can agree that the imputation method probably did not make a difference), as were sample weights (also good, contrary to the lack of plausible values).
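The ratio above is simple to compute; a minimal sketch (the function and variable names are mine, for illustration):

```python
def lecture_share(lecture_pct: float, problem_solving_pct: float) -> float:
    """Share of 'new material' time spent lecturing, as described in the
    text: lecture% / (lecture% + problemsolving%). Returns a value in [0, 1]."""
    total = lecture_pct + problem_solving_pct
    if total == 0:
        raise ValueError("no time reported for either activity")
    return lecture_pct / total

# Example: a teacher reporting 30% lecturing and 20% problem solving
# spends 0.6 of the combined 'new material' time lecturing.
print(lecture_share(30, 20))  # → 0.6
```

As the text notes, this is a relative measure: two teachers with very different absolute amounts of lecturing can have the same ratio.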

Table 1 in the paper tabulates all the variables and shows some differences between maths and science teachers, for example in the intensity of lecture-style teaching. The paper then proposes a ‘standard education production function’ model. In all the result tables we can certainly see the standard p=.10 and, again, with large N’s this seems unreasonable to me. A key result is in Table 4:

The first line is the lecture-style teaching variable. Columns 1 and 3 show that maths is significant (but keep in mind, at 5% with high N; however, 0.514 does sound quite high) and science is not. Columns 2 and 4 then show the same result, but now taking into account school sorting based on unobservable characteristics of students, through the inclusion of fixed school effects. I find the pooling a bit strange; it reminds me of the EEF pooling of maths mastery for primary and secondary to gain statistically significant results. Yes, here too, both subjects then yield significant results. Together with the plausible values issue I would be cautious.

Table 5 extends the analysis.

The same pattern arises. The key variable is significant at the questionable 10% level (column 1) and a bit stronger after adding confounding variables (at the 5% level, but again with high N). The article notes that the variable is quite constant over the columns, but also that it is lower than in the Table 4 results, showing that there are school effects.

There is a footnote on page 373 that might have received a bit more attention. I find the reporting a bit strange, because the first line indicates that the variable ranges from 0.11 to 0.14, not 0.14 to 0.1 (and why go from a larger to a smaller number; is this a typo?). Overall, 1% of an SD seems very low. I think the discussion that follows is interesting and adds some thoughts. I thought it was interesting that it said: “Our results, therefore, do not call for more lecture style teaching in general. The results rather imply that simply reducing the amount of lecture style teaching and substituting it with more in-class problem solving without concern for how this is implemented is unlikely to raise overall student achievement in math and science.” Well, that does seem a balanced conclusion, indeed. And again, a strong feature of most economics papers: the robustness checks are good.

In conclusion, I found this an interesting use of a TIMSS variable. Perhaps it could be repeated with 2011 data, and now include all five plausible values (perhaps a source of error). Nevertheless, although I think strong conclusions in favour of lecturing could be debated, likewise it could be said that there also are no negative effects of it: there’s nothing wrong with lecturing!

Categories
Education Research

Unpicking economic papers: a paper on behaviour

One of the papers that made a viral appearance on Twitter is a paper on behaviour in the classroom. Maybe it’s because of the heightened interest in behaviour, demonstrated for example in the DfE’s appointment of Tom Bennett and in behaviour having a prominent place in the Carter Review.

Carrell, S E, M Hoekstra and E Kuka (2016) “The long-run effects of disruptive peers”, NBER Working Paper 22042. link.


The paper contends that misbehaviour (actually, domestic violence) of pupils in a classroom apparently leads to large sums of money that people will miss out on later in life. There are, as always, some contextual questions of course: the paper is about the USA, and it seems to link domestic violence with classroom behaviour. But I don’t want to focus on that; I want to focus on the main result in the abstract: “Results show that exposure to a disruptive peer in classes of 25 during elementary school reduces earnings at age 26 by 3 to 4 percent. We estimate that differential exposure to children linked to domestic violence explains 5 to 6 percent of the rich-poor earnings gap in our data, and that removing one disruptive peer from a classroom for one year would raise the present discounted value of classmates’ future earnings by $100,000.”

It’s perfectly sensible to look at peer effects of behaviour of course, but monetising it -especially with a back of envelope calculation (actual wording in the paper!)- is on very shaky ground. The paper respectively looks at the impact on test scores (table 3), college attendance and degree attainment (table 4), and labor outcomes (table 5). The latter is also the one reported in the abstract.
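To see what a back-of-envelope monetisation of this kind involves, here is a rough sketch. All inputs except the class size are my own assumptions for illustration, not the paper's actual figures, which is precisely why such headline numbers deserve caution:

```python
# Back-of-envelope present discounted value of removing one disruptive peer.
# All inputs below (except class size) are assumptions for illustration,
# not the paper's actual figures.
class_size = 25           # peers exposed
annual_earnings = 30_000  # assumed average annual earnings per classmate
effect = 0.035            # assumed 3.5% earnings reduction from exposure
discount_rate = 0.03      # assumed discount rate
working_years = 40        # assumed length of working life

# PDV of the stream of avoided earnings losses per classmate
pdv_per_classmate = sum(
    effect * annual_earnings / (1 + discount_rate) ** t
    for t in range(1, working_years + 1)
)
total_pdv = class_size * pdv_per_classmate
# The headline figure is driven entirely by the assumed inputs above:
# change any of them and the total moves by hundreds of thousands.
print(round(total_pdv))
```

The exercise shows the mechanics, not the paper's calculation: small changes in the assumed effect size, earnings or discount rate swing the result enormously, which is why monetising classroom behaviour this way is on shaky ground.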

There are some interesting observations here. The abstract’s result is mentioned in the paper: “Estimates across columns (3) through (8) in Panel A indicate that elementary school exposure to one additional disruptive student in a class of 25 reduces earnings by between 3 and 4 percent. All estimates are significant at the 10 percent level, and all but one is significant at the 5 percent level.” The fact that economists would even want to use 10% (with such a large N) is already strange to me. Even 5% is tricky with those numbers. However, the main headline in the abstract can be confirmed. But have a look at panel C. It seems there is a difference between ‘reported’ and ‘unreported’ Domestic Violence (DV). Actually, reported DV has a (non-significant) positive effect. Where was that in the abstract? Rather than a conclusion along the lines of whether DV was reported or not, the conclusion only focuses on the negative effects of *unreported* DV. I think it would be fairer to make a case for better signalling and monitoring of DV, so that the negative effects of unreported DV are countered; after all, there are no negative effects on peers when DV is reported.
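The recurring point about 10% (and even 5%) significance with very large samples can be made concrete with a small, purely illustrative calculation: with a big enough N, even a substantively tiny effect produces a 'significant' test statistic.

```python
import math

def z_for_effect(effect_sd: float, n_per_group: int) -> float:
    """Expected z statistic for a true mean difference of effect_sd
    (in SD units) between two groups of n_per_group each, unit variance.
    A toy illustration, not a reanalysis of any paper's data."""
    return effect_sd / math.sqrt(2.0 / n_per_group)

# A substantively tiny 0.03 SD difference:
print(round(z_for_effect(0.03, 500), 2))     # ~0.47: nowhere near significant
print(round(z_for_effect(0.03, 50_000), 2))  # ~4.74: 'significant' at any usual level
```

The effect has not changed between the two lines, only the sample size; this is why a 10% threshold combined with a very large N says little about substantive importance.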


Categories
Education Research

Economic papers about education (CPB part 1)

Introduction

It feels as if there has been an incredible surge of econometric papers in social media. Like a lot of research, they are sometimes ‘pumped around’ uncritically. Sometimes it’s the media, sometimes it’s a press release from the university, sometimes it’s even the researchers themselves who seem to want a ‘soundbite’. These econometric papers are fascinating. What they often have going for them -according to me- is their strong, often novel, mathematical models (for example Difference in Differences or Regression Discontinuity Design). I also like how, after presenting results, there often are ‘robustness’ sections. However, they also often lack a sufficient literature overview, one that is often biased towards econometric papers (yet it is quite ‘normal’ that disciplines cite within disciplines). Also, conclusions, in my view, lack sufficient discussion of limitations. Finally, I often find the interpretation of the statistics a bit ‘typical’, in that econometric papers seem to love to use significance testing (NHST) with p=.10 (yes, I know of the criticisms of NHST) and try to summarize the findings in a rather ‘simplistic’ way. The latter might be caused by an unhealthy academic ‘publish or perish’ culture in which we sometimes feel only extraordinary conclusions are worth publishing (publication bias).

Unpicking some economics papers

Some people have asked me what I look for in such papers. In this first blog I will use some papers from a recent report by the CPB, the Netherlands Bureau for Economic Policy Analysis. They recently released a report on education policy, summarizing the effectiveness of all kinds of educational policies. As the media loved to quote a section on ability grouping, which seemed to say that ‘selection worked’, I focused on that part. It also was the topic of a panel discussion at researchED maths and science, so it was something I had looked into anyway. The research is mixed. It struck me that, as could be expected from an economic policy unit, the studies were almost all economically oriented. Some even went as far as suggesting that the review just had high standards and that maybe therefore educational and sociological research did not make the cut (because of inclusion criteria, see p. 330 of the report, in Dutch). This all-too-positive view of economic research, and less so of other research, is in my view unwarranted. It has more to do with traditions within disciplines. In this case I want to tabulate some of my thoughts about the papers around ability grouping within one type of education (p. 200 of the report). I won’t go into the specifics of the Dutch education system, but it suffices to say that the Netherlands has several ‘streams’ based on ability, while within the streams students are often grouped by mixed ability. This section wanted to look at studies of ability grouping within each of those streams. The media certainly presented it that way.

Duflo et al

The study that I recognized, as it featured in the Education Endowment Foundation toolkit, was the paper by Duflo et al. It is a study of primary schools in Kenya.

Duflo, E., P. Dupas and M. Kremer, 2011, Peer effects, teacher incentives, and the impact of tracking: evidence from a randomized evaluation in Kenya, American Economic Review, vol. 101(5): 1739-1774.

The paper was first published as an NBER working paper. There is a difference in the wording of the abstracts of the working and published paper, but in both cases the main effect is:


  • Tracking schools scored 0.14SD higher.
  • The result comes from Table 2.
  • But the paper uses a 10% significance level to get a significant result. N is large, although the number of schools -not mentioned in the table- is lower.
  • Long-run effects seem stronger, but it would be hard to argue nothing else could have influenced this.

In sum, I think we would need to be a bit careful in concluding ‘ability grouping’ works.

Vardardottir

Interestingly, Vardardottir points out the non-significant findings of Duflo et al., although in a preliminary paper there was a bit more discussion of the original Duflo et al. working paper. Maybe this is down to different results, but I thought it was poignant.

The study, conducted in Iceland and in secondary school (16-year-olds), finds that “Being assigned to a high-ability class increases academic achievement”. I thought there was a lot of agreement between the data and the findings. The study is about ‘high ability classes’ and the CPB report says exactly that. This seems to correspond with educational research reviews as well: the top end of ability might profit from being in a separate ability group. However, a conclusion about ability grouping ‘in general’, for all ability groups, is difficult to make here.

Vardardottir, A., 2013, Peer effects and academic achievement: A regression discontinuity approach, Economics of Education Review, vol. 36: 108-121.

Kim et al and Wang

A third paper mentioned in the report is one by Kim et al. Another context: secondary school, this time set in South Korea. It concludes that: “First, sorting raises test scores of students outside the EP areas by roughly 0.3 standard deviations, relative to mixing. Second, more surprisingly, quantile regression results reveal that sorting helps students above the median in the ability distribution, and does no harm to those below the median.” As an aside, it’s interesting to see that the paper had already been on SSRN (now bought by Elsevier) since 2003. This raises the question, of course, of what year the data is from. This always is a challenge; peer review takes time and papers often concern situations from many years before. In the meantime things (including policies) might have changed.

Kim, T., J.-H. Lee and Y. Lee, 2008, Mixing versus sorting in schooling: Evidence from the equalization policy in South Korea, Economics of Education Review, vol. 27(6): 697-711.

The paper uses ‘Difference-in-Differences’ techniques. I think the overall effect (the first conclusion), based on this approach, is quite clear. I personally don’t find this very surprising (yet), as most literature tends to confirm that positive effect. However, criticism of it is often along the lines of equity, i.e., as with Vardardottir, high ability profiting most, with lower ability not profiting or even being worse off. Interestingly (the authors also say ‘surprisingly’), the quantile regression seems to go into that:

The footnote summarizes the findings. If I understand correctly, the argument is that, with controls, column (2) gives the overall effect of the ability grouping per quantile. This is clear: effects significant at 1% for all groups. The F-value at the bottom tests for significant differences and is not significant (>.1; yes, economists use 10%), hence the statement of ‘no significant differences’ between different abilities. Based on column (2) one could say that; we could of course also say that a difference of .320SD versus .551SD is rather large. But what is more interesting is the pattern of significant effects over the subjects: those are all over the place, in two ways. Firstly, in the differential effects on the different ability groups, e.g. in English significantly larger positive effects for higher ability than lower ability (just look at the number of *), and in Korean significantly more negative effects for lower ability. (Note that I did see that other control variables weren’t included here; I don’t know why. There is something interesting going on here anyway, as there are differences in column (1) but the controls in (2) make them non-significant.) Furthermore, the F-values at the bottom show that only for maths are there no significant differences; for all the other subjects there are, some quite sizable. What seems to be happening here is that all the positive and negative effects over the ability groups roughly cancel each other out, yielding no significant difference. Maybe they go away when including controls, but that can’t be checked. What is clear, I think, is that there are differences between subjects. I think the conclusion in the abstract, “sorting helps students above the median in the ability distribution, and does no harm to those below the median”, therefore needs further nuance.
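As an aside, the difference-in-differences logic that papers like this rely on boils down to a simple comparison of changes; a toy sketch with invented group means (not the paper's data):

```python
# Difference-in-differences on toy group means (invented numbers).
# 'Treated' areas adopt sorting; 'control' areas keep mixing.
treated_before, treated_after = 50.0, 56.0
control_before, control_after = 51.0, 53.0

# The estimator: the change in the treated group minus the change in the
# control group, which nets out any time trend common to both groups.
did = (treated_after - treated_before) - (control_after - control_before)
print(did)  # → 4.0
```

The identifying assumption is that without the policy both groups would have followed parallel trends; the published analyses add controls and standard errors on top of this basic comparison.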

It was therefore useful that there was a follow-up article by Wang. One thing addressed here is the amount of tutoring: an example of how different disciplines could complement each other, i.e. Bray’s work on Shadow Education.

Wang, L. C., 2014, All work and no play? The effects of ability sorting on students’ non-school inputs, time use, and grade anxiety, Economics of Education Review, vol. 44: 29-41.

The article is, however, according to the CPB report, premised on the assumption that there are null effects on lower-than-average ability. Effects that, in my view, already deserve nuance based on subject differences. It is therefore very interesting that Wang looks at tutoring, homework etc., but the article seems not to continue with subject differences. This is a shame, in my view, because from my own mathematics education background -and as stated at that researchED maths and science panel- I can certainly see how maths might be different from languages. It would have been a good opportunity to also think about the top performance of Korea in international assessments, for example. Yet the take-away message for the CPB seems to be ‘ability grouping works’.

Conclusions not warranted?

There are more references, which I will try to unpick in future blogs. These will also include papers on teaching style, behaviour etc., all education topics for which people have promoted economics papers as ‘definitive proof’. There also are multiple working papers (the report argues that because some series, like NBER and IZA papers, often end up as peer-reviewed articles anyway, they might be included), which I might cover.

Nevertheless, this first set of papers, in my view, does not really warrant the conclusion ‘ability grouping works’. Though, to be fair, in many cases the abstracts might make you think differently. It shows that actually reading the original source material can be important. Yet, even if we assume they do say this, the justification that follows at the end of the paragraph is strange (translated): “The literature stems, among others, from secondary education and, among others, from comparable Western countries. The results point in the same direction, disregarding school type or country. That’s why we think the results can be translated to the Dutch situation.” Really? Research from primary, secondary and higher education (that’s the Booij one)? From Kenya, from Korea (with its shadow education)?

Final thoughts

What we have here is a large variety of educational contexts, in school type(s), years and countries, with confusing presentation of findings and, in my view, questionable p-values. OK, now I’m being facetious; I just want people to realize that every piece of research has drawbacks. They need to be acknowledged, just like the strong(er) points. If we see quality of research as a dimension from ‘completely perfect’ (we would be hard-pressed to find that) to ‘completely imperfect’, there are many, many shades in between. ‘Randomized’ is often seen as a gold standard (I still feel that this also comes with issues, but that is for another blog), yet economists have deemed all kinds of fine statistical techniques ‘quasi-experimental’ and therefore ‘still good enough’. Yet towards other disciplines there sometimes seems to be a ‘rigour’ arrogance. Likewise, other disciplines too readily dismiss some sound economics research because it seldom concerns primary data collection, or they ‘summarize’ data incorrectly. It almost feels like a clash of paradigms. I would say it depends on what you want to find out (research questions). The research questions need to be commensurate with your methodology, and both in turn need to fit the (extent of the) conclusions. We can learn a lot from each other, and I would encourage disciplines to work together, rather than play ‘we are rigorous and you are not’ or ‘your models are crap’ games. Be critical of both (as I am above; note I’m just as critical of any piece of research, without disregarding its strengths), be open to the affordances of both (and of more disciplines, of course), and let’s work together more.

Categories
Education Research Math Education MathEd

Slides from researchEd maths and science

Presentation for researchED maths and science on June 11th 2016.

References are at the end (there might be some extra references from slides that were removed later on; I thought this was interesting 🙂).

Interested in discussing, contact me at C.Bokhove@soton.ac.uk or on Twitter @cbokhove

Categories
Education Education Research Math Education MathEd

Presentatie researchED Amsterdam (Dutch)

This is the researchED presentation I gave on 30 January 2016 in Amsterdam. Some English-language words have been left in. Literature has been added at the end.