Why the new school league tables are much better … but could be better still

Rebecca Allen (IOE) and Simon Burgess (CMPO)

Tomorrow the new school league tables are published, with the usual blitz of interest in the rise and fall of individual schools. The arguments for and against the publication of these tables are now so familiar as to excite little interest.

But this year there is a significant change in the content of the tables.  For the first time, GCSE results for each school will be reported for groups of pupils within the school, groups defined by their Key Stage 2 (KS2) scores. Specifically, for each school the tables will report the percentage of pupils attaining at least 5 A*–C grades (including English and maths) separately for low-attaining pupils, high-attaining pupils and a middle group.  This change has potentially far-reaching implications, which we describe below.

This is a change for the better, one that we have proposed and supported elsewhere.  Why? We believe that in order to support parents choosing a school, league tables need to be functional, relevant and comprehensible. The last of these is straightforward (though not all league table measures in the past have been comprehensible: Contextualised Value-Added (CVA) being the perfect example). ‘Relevant’ means that a measure has some relevance to the family’s specific child. A simple school average, such as the standard whole-cohort %5 A*–C, is not very informative about how one specific pupil is likely to get on there. By ‘functional’ we mean a measure that does actually help a family to predict the likely GCSE attainment of their child in different schools. If a measure is not functional it should not be published at all.

The new group-specific component is comprehensible and is more relevant than the whole-cohort %5 A*–C measure.  In our analysis of functionality, we show that it is as good as the standard measure, and much better than CVA.

It also addresses in a very straightforward way the critique of the standard league tables that they simply reflect the ability of the intake into schools, and not the effectiveness of the school.  By reporting the attainment of specific groups of students of given ability, this measure automatically corrects for prior attainment, and in a very transparent way. This is therefore much more informative to parents about the likely outcome for their own children than a simple average.  This of course is what value-added measures are meant to do, but they have never really become popular, and as we show they are not very functional.

However, the details of the new measure now published are problematic in one way. The choice of groups is important. We defined groups by quite narrow ten-percentile bands: the low-attaining group lying between the 20th and 30th percentiles of the KS2 distribution, the high-attaining group between the 70th and 80th percentiles, and the middle group between the 45th and 55th percentiles. While clearly there is still variation in student ability within each band, it is second-order; the main differences between schools in performance for any group will come from variation in schools’ teaching effectiveness.

However, the DfE has chosen much broader bands, and has defined the groups so that they cover the entire pupil population: the low-attaining group comprises students below the expected level (Level 4) in the KS2 tests, the middle-attaining group those at the expected level, and the high-attaining group those above it.
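To make the two grouping rules concrete, here is a minimal sketch in Python using a synthetic cohort. Everything here is invented for illustration: the KS2 points distribution, the score range mapped to ‘Level 4’, and the link between prior attainment and GCSE outcomes are assumptions, not the DfE’s actual definitions.

```python
import random

random.seed(1)

# Synthetic cohort: each pupil gets a KS2 average points score (distribution
# assumed) and a GCSE outcome whose probability rises with prior attainment
# (relationship assumed, purely for illustration).
pupils = [{"ks2": random.gauss(27, 4)} for _ in range(10000)]
for p in pupils:
    p["five_ac"] = random.random() < min(max((p["ks2"] - 15) / 20, 0), 1)

scores = sorted(p["ks2"] for p in pupils)

def percentile(q):
    """Score at the q-th percentile of the empirical KS2 distribution."""
    return scores[int(q / 100 * (len(scores) - 1))]

def pass_rate(group):
    """Percentage of the group attaining 5 A*-C (synthetic outcome)."""
    return 100 * sum(p["five_ac"] for p in group) / len(group)

# Narrow ten-percentile bands, as in the authors' proposal.
narrow = {
    "low (20th-30th)":    [p for p in pupils if percentile(20) <= p["ks2"] < percentile(30)],
    "middle (45th-55th)": [p for p in pupils if percentile(45) <= p["ks2"] < percentile(55)],
    "high (70th-80th)":   [p for p in pupils if percentile(70) <= p["ks2"] < percentile(80)],
}

# Broad DfE-style bands keyed to the expected level; the Level 4 score
# cut-offs (21 and 27 points) are assumed here, not official values.
broad = {
    "below Level 4": [p for p in pupils if p["ks2"] < 21],
    "at Level 4":    [p for p in pupils if 21 <= p["ks2"] < 27],
    "above Level 4": [p for p in pupils if p["ks2"] >= 27],
}

for name, group in {**narrow, **broad}.items():
    print(f"{name}: n={len(group)}, %5 A*-C = {pass_rate(group):.1f}")
```

Running this shows the structural point in the post: each narrow band holds roughly 10% of the cohort, while the broad middle band holds a far larger and more heterogeneous slice of pupils.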

This has one significant disadvantage, set out in detail by Rebecca Allen here. The middle group contains around 45% of all pupils, and so there is very significant variation in average ability within that group across schools. This in turn means that differences in league table performance between schools will reflect differences in intake as well as effectiveness, even within the group, thus partly undermining the aim of group-specific reports.

The chart below illustrates this for the middle attainment group (see here for more details).  Each of the three thousand or so tiny blue dots shows the capped GCSE attainment for a group of mid-attaining pupils (on the DfE’s measure of achieving at the expected level at KS2) against the average KS2 score (i.e. prior attainment) of pupils at the school. The red dots plot the same relationship for our narrow group of middle attainers (the 45th to the 55th percentile). The chart shows very clearly that the performance among our narrow band is essentially unrelated to prior attainment, but the DfE measure for the very broad group does still favour schools with higher prior ability pupils.

We can speculate as to why the DfE chose to have much broader groups. There may be statistical reasons, pragmatic reasons or what can be termed “look and feel” reasons. Using narrow KS2 bands will correctly identify the effectiveness of the school, but will almost always be averaging over a small number of students. So the estimates will tend to be “noisy”, and vary more from year to year than averages over bigger groups. The trade-off here is then between a noisy measure of something very useful and a more stable measure of something less useful. Our original measure was intended to balance these; the DfE has gone all the way to the latter.
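The noise argument can be made concrete with the standard error of a proportion: if a group of n pupils has true pass rate p, the observed pass rate varies across cohorts with standard error roughly sqrt(p(1-p)/n). A minimal sketch, with hypothetical true rate and group sizes:

```python
import math

def pass_rate_se(p, n):
    """Standard error of an observed pass rate when the true rate is p
    and the group contains n pupils (binomial approximation)."""
    return math.sqrt(p * (1 - p) / n)

true_rate = 0.6  # hypothetical true %5 A*-C for the group

# A narrow ten-percentile band in one school's cohort might hold ~15 pupils;
# a broad band covering ~45% of pupils might hold ~80 (both figures assumed).
for label, n in [("narrow band", 15), ("broad band", 80)]:
    se = pass_rate_se(true_rate, n)
    print(f"{label}: n={n}, SE = {100 * se:.1f} percentage points")
# -> narrow band: SE = 12.6 percentage points; broad band: SE = 5.5
```

With these assumed sizes, a school’s narrow-band pass rate could easily swing by ten or more percentage points from one year’s cohort to the next through sampling noise alone, which is the stability the DfE’s broad bands buy at the cost of within-group heterogeneity.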

A pragmatic reason is that some schools may not have any pupils in a particular narrow percentile band of the KS2 distribution. The narrower the band, the more likely this is to be true. This would mean either null entries in the league tables, which might be confusing, or some complex statistical imputation procedure, which might be more confusing. The broad groups that cover the entire pupil population are likely to have very few null entries. Finally, the broad groups feel more ‘inclusive’: they report the performance of all of a school’s students. This is a red herring – the point of the tables is to inform parents in choosing a school, not to generate warm glows.

The new measures hold out the promise of improvements in two areas: choices by parents and behaviour by schools. On the first, parents will have better information on the likely academic attainment of their child in a range of schools. They will also be able to see more directly whether school choice actually matters a great deal for them: whether there are worthwhile differences in attainment within the ability group of their child.

The key point for schools is that performance measures have consequences for behaviour. If this new measure is widely used, it will give schools more of an incentive to focus across the ability distribution. It is still the %5 A*–C measure that is the focus of attention for each group, but now schools will have to pay attention to improving this metric for high- and low-ability groups as well as for the marginal children with the highest chance of getting that crucial fifth C grade.

If one believes that gaming and focussing of resources within schools is a very big deal (and there is little quantitative evidence either way) then the new measures could have a major impact on such behaviour. Even if such resource focussing is second order, performance measures send signals on what is valued. These new league table measures will explicitly draw widespread media and public attention to the performance of low- and high-ability children in every school in England.

A Report of Two Halves

Simon Burgess

We published some research last Friday showing that students perform less well in their crucial GCSE exams in years when there is a major international football tournament taking place at the same time. For example, the FIFA World Cup in the summer of 2010 and the UEFA European Championship next summer both overlap in part with the GCSE exam timetable.

With the draw for the groups in the European Championship taking place earlier that day, much of the comment naturally and sensibly focussed on the specific issue of the impact of next year’s tournament on exam scores. This is important: we estimate that the concurrence of the exams and saturation media coverage of the football reduces exam scores on average by around 0.12 standard deviations of pupil performance, and by considerably more for some groups who reduce their effort substantially. These groups tend to be from poorer areas and are predominantly (but by no means exclusively) male students.  Since these groups are already lower-performing, this means that education gaps will widen. We think of this impact as arising through a reduction in student effort, with that time being spent instead on watching the football tournament. The variation in impact arises because of differing tastes for football, arising in turn from cultural norms and idiosyncratic factors, and from the differential effectiveness of an hour of study on exam performance.

However, there is also a broader significance to the research: finding that effort matters matters.

Recent research by economists has broadened out from the previous focus on cognitive ability, and a great deal of work has investigated the role of non-cognitive factors in educational attainment. Non-cognitive factors can be identified with personality traits (see Heckman), and one of the ‘big 5’ personality traits is ‘conscientiousness’, with the related traits of self-control, accepting delayed gratification, and a strong work ethic. Conscientiousness has been shown to be an excellent predictor of educational attainment and course grades. These aspects of self-control and ability to concentrate are clearly related to the broad notion of effort we are using here. Our results on the importance of effort strengthen this evidence by isolating the effect of decisions on effort and time allocation in addition to the general ability to concentrate and exert self-control.

There is a great deal of policy interest in England arising from recent studies of US Charter schools with what is called a “No Excuses” ethos. This includes the KIPP (Knowledge Is Power Program) network of schools and schools in the Harlem Children’s Zone. These schools all feature a long school day, a longer school year, very selective teacher recruitment and strong norms of behaviour, among other characteristics. Some of the profession’s very top researchers have produced evidence showing that such schools produce very powerful positive effects on student achievement. While this overall effect could be due to different aspects of the KIPP/HCZ ethos, part of it is very likely to be increased effort from the students. Our results complement this by isolating the impact of a change in effort alone, and showing that it can have very substantial effects.

This matters for a number of reasons. First, unlike genetic characteristics, cognitive ability or non-cognitive traits, effort is almost immediately changeable. Our results suggest that this could have a big effect. The fact that we find changes in student effort to be very potent in affecting test scores suggests that policy levers to raise effort, through incentives or changing school ethos, are worth considering seriously. Such interventions would be justified if the low effort resulted from market failures due to lack of information on the returns to schooling, or time-inconsistent discounting.  Second, the importance of a manipulable factor such as effort for adolescents’ educational performance provides evidence of potentially high-value policy interventions much later in life than “early years” policies. This is encouraging, offering some hope that low-performing students’ trajectories in life can perhaps be effectively improved even after a difficult environment early in life.