The title of this paper by Schwerdt and Wuppermann, “Is traditional teaching really all that bad?”, makes clear that it sets out to show it isn’t. Without this paper I would have said the same thing, simply because I wouldn’t deny that ‘direct instruction’ has had a rough treatment in recent decades.
There are several versions of this paper on SSRN and other repositories. The published version is from ‘Economics of Education Review’, which immediately shows why I have included it: with the rise of economics papers in education, some have preferred to cite this paper rather than work from a more sociological, psychological or education research tradition.
The literature review is, as is often the case in economics papers in my opinion, a bit shallow. The study uses TIMSS 2003 year 8 data (I don’t know why they didn’t use the 2007 data).
I find the wording “We standardize the test scores for each subject to be mean 0 and standard deviation 1.” a bit strange, because the TIMSS dataset, as in later years, does not really have ‘test scores per subject’: students do not take all the assessment items.
Instead, there are five so-called ‘plausible values’. Not using them might underestimate the standard error, which in turn might make results come out significant more readily. This variable is the outcome; the other key variable is question 20.
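To make that concern concrete, here is a minimal sketch (in Python, with made-up numbers and names of my own choosing, not the authors’) of how the five plausible values would normally be used: run the same analysis once per plausible value and combine the five estimates with Rubin’s rules, which adds a between-imputation term to the variance and therefore gives a larger standard error than any single score would suggest.

```python
import numpy as np

def rubin_combine(est_per_pv, se_per_pv):
    """Combine estimates across plausible values using Rubin's rules."""
    est_per_pv = np.asarray(est_per_pv, dtype=float)
    se_per_pv = np.asarray(se_per_pv, dtype=float)
    m = len(est_per_pv)
    combined_est = est_per_pv.mean()
    within_var = (se_per_pv ** 2).mean()      # average sampling variance
    between_var = est_per_pv.var(ddof=1)      # variance across plausible values
    total_var = within_var + (1 + 1 / m) * between_var
    return combined_est, np.sqrt(total_var)

# Five hypothetical estimates of the same coefficient, one per plausible value
est, se = rubin_combine([0.51, 0.48, 0.55, 0.50, 0.53],
                        [0.20, 0.21, 0.19, 0.20, 0.22])
print(est, se)  # the combined SE exceeds the average per-PV SE
```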
The distinction between instruction and problem solving is based on three of the sub-items of question 20: b is taken as direct instruction, and c and d together as problem solving (note that one of them does, of course, mention ‘guidance’). There is an emphasis on ‘new material’, so I can see why these were chosen. Of course, the use of percentages means that no absolute norm is apparent, but I can see how lecture%/(lecture%+problemsolving%) denotes a ratio of lecturing; a minimal sketch follows below. The other five elements are used together as controls. Mean imputation was used (I can agree that the imputation method probably did not make a difference), as were sample weights (also good, unlike the omission of the plausible values).
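As promised, a minimal sketch of how I read the construction of the key variable, including the mean imputation; the column names are mine, not from the TIMSS files.

```python
import pandas as pd

# Hypothetical teacher questionnaire percentages for items b, c and d
df = pd.DataFrame({
    "pct_lecture":         [40, 25, None, 60],
    "pct_problems_guided": [30, 40, 20,   10],
    "pct_problems_alone":  [10, 20, 30,   10],
})

# Mean imputation for missing questionnaire answers, as the paper reports
df = df.fillna(df.mean(numeric_only=True))

# Share of lecturing relative to lecturing plus in-class problem solving
problem_solving = df["pct_problems_guided"] + df["pct_problems_alone"]
df["lecture_ratio"] = df["pct_lecture"] / (df["pct_lecture"] + problem_solving)
print(df["lecture_ratio"])
```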
Table 1 in the paper tabulates all the variables and shows some differences between maths and science teachers, for example in the intensity of lecture-style teaching. The paper then proposes a ‘standard education production function’ as its model (I sketch what this typically looks like further below). In all the result tables we can certainly see the standard p = .10 threshold being used, and again, with such large N’s this seems unreasonable to me. A key result is in Table 4:
The first line is the lecture-style teaching variable. Columns 1 and 3 show that maths is significant (but keep in mind: at 5% with a high N; still, 0.514 does sound quite high) and science is not. Columns 2 and 4 then show the same result, but now taking into account school sorting based on unobservable characteristics of students through the inclusion of fixed school effects. I find the pooling a bit strange; it reminds me of the EEF pooling of maths mastery for primary and secondary to obtain statistically significant results. Yes, here too, pooling both subjects yields significant results. Together with the plausible values issue, I would be cautious.
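For readers less familiar with the setup, here is a rough sketch of what such a ‘standard education production function’ with and without school fixed effects could look like. The data are simulated and the control set and column names are placeholders of my own, not the authors’ exact specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in data: standardized score, lecture ratio, a few controls,
# a school identifier and a sampling weight
rng = np.random.default_rng(0)
n = 1000
data = pd.DataFrame({
    "score_std":     rng.normal(0.0, 1.0, n),
    "lecture_ratio": rng.uniform(0.0, 1.0, n),
    "female":        rng.integers(0, 2, n),
    "books_at_home": rng.integers(0, 5, n),
    "school_id":     rng.integers(0, 100, n),
    "weight":        rng.uniform(0.5, 1.5, n),
})

def estimate(formula):
    model = smf.wls(formula, data=data, weights=data["weight"])
    # Cluster standard errors at the school level, as is common with TIMSS data
    return model.fit(cov_type="cluster", cov_kwds={"groups": data["school_id"]})

baseline = estimate("score_std ~ lecture_ratio + female + books_at_home")
# Adding school dummies means comparing teachers within the same school,
# so sorting of students across schools on unobservables drops out
with_fe = estimate("score_std ~ lecture_ratio + female + books_at_home + C(school_id)")

print(baseline.params["lecture_ratio"], baseline.bse["lecture_ratio"])
print(with_fe.params["lecture_ratio"], with_fe.bse["lecture_ratio"])
```

The school dummies are the whole point of columns 2 and 4: the coefficient on the lecture ratio is then identified from variation within schools rather than between them.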
Table 5 extends the analysis.
The same pattern arises. The key variable is significant at the questionable 10% level (column 1) and a bit stronger after adding confounding variables (at the 5% level, but again with a high N). The article notes that the variable is quite constant across the columns, but also that it is lower than the Table 4 results, showing that there are school effects.
There is a footnote on page 373 that might have deserved a bit more attention. I find the reporting a bit strange, because the first line indicates that the variable ranges from 0.11 to 0.14, not from 0.14 to 0.1 (and why go from a larger to a smaller number, is this a typo?). Overall, 1% of an SD seems very low. I think the discussion that follows is interesting and adds some thoughts. I found it interesting that the paper says: “Our results, therefore, do not call for more lecture style teaching in general. The results rather imply that simply reducing the amount of lecture style teaching and substituting it with more in-class problem solving without concern for how this is implemented is unlikely to raise overall student achievement in math and science.”. Well, that does seem a balanced conclusion, indeed. And again, as is a strong feature of most economics papers, the robustness checks are good.
In conclusion, I found this an interesting use of a TIMSS variable. Perhaps it could be repeated with 2011 data, this time including all five plausible values (a possible source of error here). Nevertheless, although I think strong conclusions in favour of lecturing could be debated, it could likewise be said that there are no negative effects of it: there’s nothing wrong with lecturing!