(Mis-)understanding school segregation in England? Comments on a new measure of segregation

Simon Burgess and Rich Harris

 

A new measure of segregation has been proposed by the iCoCo Foundation, School Dash, and The Challenge, which is a charity for building a more integrated society. It appears in the report, Understanding School Segregation in England, 2011-16, where the method, details of which can be found here, has been used to look at ethnic and social segregation between schools in England. The report states,

Across all schools in 2016, 26% of primary schools and 40.6% of secondary schools were found to be ethnically segregated or potentially contributing to segregation by our measure; while 29.6% of primary schools and 27.6% of secondary schools were found to be segregated by socio-economic status, using FSM-eligibility as a proxy (p.13 of The Challenge’s report, emphasis added)

And the first of six recommendations is:

As part of its response to the Casey Review [Casey 2016], the Government should recognise the trends that Casey, ourselves and many others have identified and set a clear direction to reduce the growth of school segregation and to reduce segregation wherever it is at a high level and encourage all agencies to act accordingly, providing advice, support, guidance and resources as appropriate (p.17, emphasis added)

Whilst few would argue against reducing segregation, the assertion that it is growing is contentious. There have been a number of claims that segregation is increasing (for example here). However, the very clear consensus is that ethnic residential segregation fell between the 2001 and 2011 population censuses in England. In regard to ethnic segregation between schools, one of us has shown that the overall trend is downwards (see Burgess here). This difference in conclusion raises two important issues: what do we mean by segregation, and how does this ‘new measure’ differ from more established approaches, potentially affecting its results?

Let’s begin with segregation. It’s an emotive word that conjures up pejorative meaning in the press and in public debate so it has to be used with care and clarity. The widespread academic meaning is simply one of looking to see whether the places where one ethnic group is more likely to be found are also the places where another group is not: to say “segregation is high” is to describe a situation where ethnic groups are spread very unevenly between different schools and largely in different schools from one another. The standard measure of segregation quantifies this unevenness but offers no insight on how it occurs. Segregation is typically conceived as the net outcome of a number of often inter-connected processes, embodying the decisions of different people and institutions, and of the structural constraints upon them such as the operations of the housing and labour markets.

Interest in measuring segregation, and the data to do so, grew in the 1950s, initially in the US in particular, with a focus on black-white segregation within cities. Standardised ways of measuring were developed, indices were compared, and their statistical and technical properties established. The most commonly used measure is the Index of Dissimilarity (D index), which measures the extent to which the composition of individual units (such as schools, neighbourhoods or city blocks) differs across a study region. It captures the key part of the definition of segregation, namely separation: it measures the extent to which different groups are found apart in separate housing, separate schools, or separate jobs. This is a tried and trusted measure, with well-understood properties and with a huge back catalogue of comparable results. It can also be decomposed; for example, to gauge the contribution of different types of school to the overall value or even to assess scale effects (is segregation happening at the micro-, meso- or macro-scales?). This is the measure both of us have used to understand patterns of ethnic segregation in England’s schools, including analysis of the levels and trends in ethnic segregation (for example Burgess and Wilson, 2005; Harris, 2017).
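For readers who prefer to see the calculation, the two-group D index can be sketched in a few lines of Python. This is a minimal illustration of the standard formula, half the sum over units of the absolute difference between each unit's share of group A and its share of group B. The function name and the pupil counts are invented for the example and are not real data.

```python
def dissimilarity_index(group_a, group_b):
    """Two-group Index of Dissimilarity across units (e.g. schools).

    group_a[i] and group_b[i] are the pupil counts of each group in unit i.
    Returns a value between 0 (every unit has the same mix) and 1
    (the two groups never share a unit).
    """
    total_a = sum(group_a)
    total_b = sum(group_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("both groups need at least one member")
    return 0.5 * sum(
        abs(a / total_a - b / total_b) for a, b in zip(group_a, group_b)
    )


# Hypothetical counts for four schools (illustrative only, not real data).
group_a = [90, 60, 40, 10]
group_b = [10, 40, 60, 90]
print(round(dissimilarity_index(group_a, group_b), 3))
```

The value of D can be read as the share of one group that would have to change schools for the two groups to be spread evenly.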

So questions arise: why a new measure? What is the added value in basing a measure on a comparison of the school with its local neighbourhood? How does it relate to the Dissimilarity Index, and does it tell us anything new about segregation?

What the new measure tries to capture is the difference between schools and their neighbourhoods on the basis that the ethnic composition of a school ‘ought’ to reflect that of its surrounding neighbourhoods if admissions into that school are geographically determined and unbiased against race, social class, prior attainment, and so forth. This is not the first time that the differences between schools and neighbourhood have been measured (see Burgess et al, 2005; Johnston et al 2006) and the Casey Review makes much of the fact that, “the school age population is even more segregated when compared to residential patterns of living” (p.11) [but that’s most likely a demographic effect and not evidence that the segregation is increasing: see Harris, 2017]. However, rather than looking at the ethnic composition of neighbourhoods directly the measure actually takes a proxy for the ‘local area’ by averaging the characteristics of the nearest 10 other schools to the one under consideration. The assumption is that any school’s intake should be the same as the average of its 10 nearest neighbours.

We will return to that assumption presently. Before doing so we note that this is not a measurement of segregation as widely understood because it no longer considers segregation as an overall outcome. Instead, it is a partial look at the differences between where people live and where they go to school; differences perhaps due to the ways that the school admissions system works but perhaps also due to the school choices people make.

The difference in what is meant by segregation is easy to see with an example. Imagine a city with two ethnic groups who largely live in different zones of the city (residential segregation is very high). There are ten schools in each of the two zones of the city – each zone is entirely mono-ethnic and so are its ten schools. The standard D index would say that school segregation in this city is very high. Half the schools have 100% pupils from one group and the other half have 100% from the other group, so this is maximum separation, maximum segregation. The new measure, however, would say that there was low or zero segregation, with most or all schools having the same ethnic composition as their neighbours and their local area. It seems to us that the standard measure would fit better with how many people would describe the city. It would be peculiar to claim that a very divided city, with very divided neighbourhoods and therefore very divided schools is experiencing no segregation!
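The thought experiment can be checked numerically. The sketch below is a simplification of our own and not the report's code: it lays out the hypothetical city's twenty mono-ethnic schools along a line, computes D, and then computes a stylised local comparison in which each school's share of one group is set against the average share in its k nearest other schools. The positions, pupil numbers and function names are all assumptions made for the illustration.

```python
def dissimilarity_index(group_a, group_b):
    """Standard two-group Index of Dissimilarity."""
    total_a, total_b = sum(group_a), sum(group_b)
    return 0.5 * sum(abs(a / total_a - b / total_b)
                     for a, b in zip(group_a, group_b))


def local_gaps(positions, share_a, k):
    """Gap between each school's share of group A and the mean share
    in its k nearest other schools (straight-line distance on a line)."""
    gaps = []
    for i, here in enumerate(positions):
        others = sorted(
            (j for j in range(len(positions)) if j != i),
            key=lambda j: abs(positions[j] - here),
        )[:k]
        neighbour_mean = sum(share_a[j] for j in others) / k
        gaps.append(abs(share_a[i] - neighbour_mean))
    return gaps


# Hypothetical city: ten schools in each of two distant zones, 100 pupils each.
positions = list(range(10)) + list(range(100, 110))
group_a = [100] * 10 + [0] * 10   # zone 1 schools are entirely group A
group_b = [0] * 10 + [100] * 10   # zone 2 schools are entirely group B
share_a = [a / (a + b) for a, b in zip(group_a, group_b)]

print(dissimilarity_index(group_a, group_b))        # standard measure
print(max(local_gaps(positions, share_a, k=9)))     # neighbours all in own zone
print(max(local_gaps(positions, share_a, k=10)))    # one neighbour from the other zone
```

With these invented numbers D is at its maximum of 1, while the largest local gap is zero when the nine nearest schools are used and about 10 percentage points when a tenth school, from the other zone, is drawn in.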

What the new measure really means is that there is no additional segregation, once neighbourhood sorting has been taken into consideration. That is an interesting point-of-view, especially if we change the example and imagine two very ethnically mixed zones of a city, within which one particular school nevertheless obtains a very mono-ethnic intake. Such a circumstance would raise questions about the processes of school choice or of admission that led to a locally uneven outcome. However, there are problems with this approach. First, it implies that segregation caused by the school admission process can be measured independently of (having controlled for) segregation from neighbourhood sorting. In practice the two are interrelated: think of the way house prices increase around the most sought-after schools. Second, as we have noted, the authors actually measure the ‘local area’ by averaging the characteristics of each school’s 10 nearest neighbours.

The choice of 10 seems very arbitrary and will mean different things in different areas – notably areas of high population density versus those that are sparse. More specifically, it is hard to understand why the best choice is 10 schools for both primary and secondary sectors when there are about 7 times more primary schools than secondary schools. The nature of bunching of schools in urban spaces is likely to mean that some schools are used in a large number of these comparisons; conceivably a single school might be in the “nearest 10 schools” for all the schools in an urban LA. This gives that school a lot of statistical ‘leverage’. If such schools are also unusual in their composition (which an urban centre school might be), then this makes the measure very dependent on those few high-leverage schools. It’s also not necessarily the case that the 10 nearest (by straight line distance) are also the 10 most easily reached. The measure ignores any natural or human-made barriers that would prevent the one school recruiting from the same admission spaces as the other ten. Nor does it weight for distance, even though it is reasonable to suppose that the closest school should be the most similar and the tenth nearest the least. In fact, the general approach of identifying local comparison groups for schools is not new and other more sophisticated approaches have been used, for example modelling the de facto catchment areas of schools and where they appear to be ‘competing’ or defining the admission spaces of schools in some way to compare school intakes with the composition of neighbourhoods (Harris, 2011, and for a relevant critique of the approach: Watts, 2013; Harris et al., 2013).
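The leverage point can be illustrated by counting how often each school turns up in other schools' comparison sets. The sketch below is an assumption-laden toy, not the report's procedure: the school names and coordinates are hypothetical, distance is straight-line, and k is set to 1 rather than 10 simply to keep the example small.

```python
import math
from collections import Counter


def nearest_k(schools, name, k):
    """Names of the k schools nearest to `name` by straight-line distance."""
    origin = schools[name]
    others = [n for n in schools if n != name]
    others.sort(key=lambda n: math.dist(origin, schools[n]))
    return others[:k]


def comparison_counts(schools, k):
    """How many times each school is used in another school's comparison set."""
    counts = Counter({name: 0 for name in schools})
    for name in schools:
        counts.update(nearest_k(schools, name, k))
    return counts


# Hypothetical layout: one central school ringed by four suburban schools.
schools = {
    "centre": (0.0, 0.0),
    "north": (0.0, 5.0),
    "south": (0.0, -5.0),
    "east": (5.0, 0.0),
    "west": (-5.0, 0.0),
}
counts = comparison_counts(schools, k=1)
print(counts.most_common())
```

In this layout the central school is the nearest neighbour of every other school, so its composition enters four of the five comparisons, whereas most of the outer schools are never used at all.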

The nature of the measure means that there are a number of specific decisions made in its construction that are questionable and will have significant impacts on its results. Why is segregation defined by a ratio split and an absolute point split? Why is the ratio “half or double”, and are the results of the study robust to other plausible values?  Why is the absolute point split set to 15 percentage points, and again, does it matter?

Two specific issues are highlighted in the report: the role of faith schools and trends in the data. On the former, it is well known that faith schools can have intakes that differentiate them from other surrounding schools, sometimes appearing more socially privileged. This is because they tend to recruit over larger areas (with admission policies that are less geographically determined), because there can be an element of selection (by religious practice), and because different types of faith school can appeal differently to different ethnic groups. The role of faith schools and their admission policies is an area of on-going debate (the Prime Minister has recently suggested that the limit on the number of children a school can recruit on a faith criterion will be removed). However, faith schools are a heterogeneous group – as a category it masks diversity.

Turning to the overall trend in school segregation, ethnic segregation in schools is falling overall using the traditional measures. While there are undoubtedly some places where it is increasing, that can happen when ‘minority’ groups are, on average, younger than the White British and so there are more children of particular ethnicities in particular places. In fact, while not actually mentioned in the report, and standing against some of the slightly alarmist discussion, even on this new measure, the overall trend in school segregation is down. On p. 13, the report states that “For secondary schools, 64 areas saw an increase in the number of segregated schools, whereas 74 saw a decrease (with 12 seeing no change).” So there was an increase in 43% of areas on this new measure, some of which may be large and some of which may be small.
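The 43% figure follows directly from the counts quoted from the report, as this short check shows. The variable names are ours; the three numbers are those given on p. 13.

```python
# Counts of areas quoted from p. 13 of the report (secondary schools).
increase, decrease, no_change = 64, 74, 12
total_areas = increase + decrease + no_change

share_increase = increase / total_areas
share_decrease = decrease / total_areas

print(total_areas)
print(f"{share_increase:.1%} of areas saw an increase")
print(f"{share_decrease:.1%} of areas saw a decrease")
```

More areas saw a decrease than an increase, which is the sense in which the trend is down even on the new measure.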

In summary, local measures and comparisons have a role and are useful. But they do not measure ‘segregation’ as it is usually understood and they are not unproblematic. The authors say that the new measure is “fairer and more accurate”: we would dispute this.

Two final points on an implicit link between this measure and policy ideas. First, we need to be careful about drawing attention to, and making policy based on, any exception to a general rule. We will always find places where segregation is increasing though usually only in the short term due to demographic changes. But if the overall trend is one of decreasing segregation – however measured – then that is the key result that needs to be emphasised. Second, the idea behind the new measure – comparing the composition of a school to its neighbourhood – risks implying some recommendations that might be counter-productive. Taking the base view here that schools should reflect their areas, and focussing for a moment on social segregation – segregation by eligibility for Free School Meals – the seemingly natural implication follows that we should ensure poor children go to school in poor neighbourhoods and affluent pupils go to school in affluent neighbourhoods. This is anathema to us and is surely destined to reduce social mobility and reduce contact between ethnic groups.

 

References

Burgess, S. and Wilson, D., (2005) Ethnic segregation in England’s schools. Transactions of the Institute of British Geographers, vol 30 (1), pp. 20 – 36

Burgess, S., Wilson, D. and Lupton, R. (2005) Parallel Lives? Ethnic Segregation in Schools and Neighbourhoods. Urban Studies vol. 42 no. 7

Casey L. (2016) The Casey review: A review into opportunity and integration. London: Department for Communities and Local Government

Harris R, 2011, Measuring Segregation a Geographical Tale. Environment and Planning A, 43, 1747 – 1753

Harris R, Johnston R, Jones K, Owen D, (2013), Are indices still useful for measuring socioeconomic segregation in UK schools? A response to Watts. Environment and Planning A, 45, 2281 – 2289

Harris R, (2017), Measuring the scales of segregation: looking at the residential separation of White British and other school children in England using a multilevel index of dissimilarity. Transactions of the Institute of British Geographers, in press

Johnston R, Burgess S, Wilson D, Harris R, 2006, School and Residential Ethnic Segregation: An Analysis of Variations across England’s Local Education Authorities. Regional Studies, 40, 973 – 990

Watts M, 2013, Socioeconomic segregation in UK (secondary) schools: are index measures still useful? Environment and Planning A, 45, 1528 – 1535

 

Celebrating the GCSE Performance of the “Children of Immigrants”

Since 2005, I have shown here, here and here that pupils from ethnic minorities perform well in the crucial GCSE exams at the end of compulsory schooling. In particular, in terms of the progress they make through secondary school, some of the results are very impressive indeed. This is not due to any material advantages that these children have. Instead, the discussion is about aspirations, ambition and attitudes to school. Recently, Education Datalab have shown that in selective areas ethnic minority pupils are more likely to pass the 11+ too.

This post is a short follow-up to comments in a speech from Ofsted’s Chief Inspector, Sir Michael Wilshaw a few days ago. He said:

“And there is another successful aspect to our school system that has largely gone unnoticed. We regularly castigate ourselves – rightly – for the poor performance of white British pupils. Children of immigrants, conversely, have in recent years done remarkably well. … Our schools are remarkable escalators of opportunity. Whatever cultural tensions exist outside of school, race and religion are not treated as handicaps inside them. All children are taught equally.”

This echoes the 2014 analysis I made of London’s educational success, largely driven by the much higher fraction of “children of immigrants” in the capital (36% of pupils in London are White British, compared to 84% of pupils in the rest of the country).

Indeed, a focus on the ‘London effect’ has largely eclipsed the fantastic performance of ethnic minority pupils in the public debate.  In 2014, I said:

 “In this rush to hang on to the effects of a slightly mysterious policy [London Challenge], we are just marching past a demonstrable achievement of London. Sustaining a large, successful and reasonably integrated multi-ethnic school system containing pupils from every country in the world and speaking over 300 languages is a great thing. The role of ethnic minorities in generating London’s premium shows that London is achieving this. How many of those are there? I don’t know enough about school systems around the world to say, but I’d guess it’s probably unique.”

It is worth briefly revisiting the facts on how very well they do.

Of course, there is no data in the National Pupil Database on the immigrant status of children. However, we do have a rough approximation for this: whether English is the language spoken at home, or whether it’s another language. The latter group are said to have “English as an Additional Language”. In order to focus on progress through secondary school in a transparent way, I focus on pupils who all achieved the same level (level 4) in Keystage 2 maths tests (the same results arise if I use Keystage 2 English tests). For those who need such things, a full regression approach is adopted in the papers noted above.

First of all, the table below simply reports the average performance, across a range of different outcome measures, of pupils with English as an Additional Language and of pupils for whom English is their first language. The gaps are strongly statistically significant, and very substantial. For example, for the headline benchmark of the percentage achieving at least 5 A*-C grades (including English and Maths, E & M), the figures are 50.5% for pupils with English as their first language and 62.9% for pupils with English as an Additional Language. The same pattern is repeated across all the other measures shown.

GCSE Performance metrics:

| Pupils with: | Percent achieving at least 5 A*-C grades inc’g E & M | Total GCSE Points | GCSE grade in Maths | GCSE grade in English | Number of pupils |
| --- | --- | --- | --- | --- | --- |
| English as the First Language | 50.48 | 339.2 | 4.92 | 4.93 | 227429 |
| English as an Additional Language | 62.88 | 358.0 | 5.39 | 5.23 | 24761 |

Only pupils with level 4 in KS2 Maths. GCSEs coded as A* = 8, A = 7, B = 6, …. U = 0.

(Note that this is data from 2013 GCSEs as that is what I have handy, but I seriously doubt anything has wildly changed).

In fact, if we look in a little more detail, the contrast is even stronger. The difference in performance between the two groups among pupils who are eligible for Free School Meals is huge, 28.5% and 55.8%. That poor pupils with English as an Additional Language score better than non-poor pupils with English as their First Language is indeed a remarkable achievement.

The percentage achieving at least 5 A*-C grades (including E & M):

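| Pupils with: | Eligible for Free School Meals: No | Eligible for Free School Meals: Yes | Gender: Female | Gender: Male |
| --- | --- | --- | --- | --- |
| English as the First Language | 53.7 | 28.5 | 60.9 | 39.1 |
| English as an Additional Language | 65.2 | 55.8 | 74.0 | 51.2 |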

 

As Sir Michael Wilshaw rightly says, we urgently need to address the low GCSE attainment of poor White British pupils. But we should not let that stop us from celebrating the joint success of the “children of immigrants” and England’s education system.

 

Using behaviour incentives to raise GCSE attainment in poor neighbourhoods: Evidence from a large-scale field experiment

Many countries struggle with a long tail of low attainment in schools.  This blights individual lives and represents lost output for the economy as a whole. Low attainment is also typically associated with particular socio-economic backgrounds, and growing up in poor neighbourhoods, which strengthens the persistence of disadvantage.  Increasingly, governments are turning to new ideas in an attempt to deal with this problem. One of these is the potential for incentives to change behaviours in schools.

Our new study shows that these have powerful positive effects on GCSE scores for many pupils, wiping out about half of the disadvantage attainment gap in secondary schools. This effect is concentrated on low-attaining pupils, with no effect on high attainers.

Why are incentives needed? There is an obvious and substantial incentive for good performance in school – studying hard will earn good qualifications, which will bring a good income, better health, longer life expectancy, and higher self-reported well-being. For students who have already internalized the inherent incentives for working hard in school, additional rewards may add little further motivation.

But there are many places where that argument can break down for some pupils. Pupils may not really know what is needed to achieve high grades; they may misunderstand the importance of effort (rather than say innate ability or parental resources); they may believe that qualifications will not help them; they may lack the facilities to study and the motivation to secure them; or they may only really care about now.  Such students may be responsive to short term incentives for effort. So it seems likely that there will be diverse responses to incentives: powerful for some, irrelevant for others who are already well motivated.

We set up a large scale field experiment involving over 10,000 pupils in 63 schools to test the impact of incentives. We recruited schools in the poorest decile of neighbourhoods in England. The experiment was funded by the Education Endowment Foundation, to whom we are very grateful.

We incentivised inputs (effort and engagement), not outputs such as test scores. These were repeated, immediate rewards incentivising effort and engagement at school. The pupils involved were in year 11, the final year of compulsory schooling leading up to the very high stakes GCSE assessments.  The incentives are based on conduct in class, working well during class, completing homework, and not skipping school. We compared a cash incentive and a non-financial reward: a high-value event determined jointly by the school and students.  The cash incentive offered up to £80 per half-term (for a total of £320 over the year).  This might sound a lot, but at the youth minimum wage some of them would be earning a few months later, it works out at less than an extra hour per week-night. All the details are in our paper.

The experiment has yielded promising new results. In fact, our paper is the first to test the use of behaviour incentives for high-stakes tests, and the first to compare financial and non-financial rewards over the timescale of an academic year.

Behaviour was incentivised in classes for GCSE Maths, English, and Science. Our hope was that improved effort and engagement would raise GCSE scores, even though the scores themselves carried no rewards. The overall impact of the incentives on achievement is low, with small, positive but statistically insignificant effects on exam performance.  However, that small effect is an average of the effect on two groups. There are pupils who “get” the inherent incentive in education, have no need of further encouragement and for whom we would expect zero effect; and there are pupils who don’t, and who might well be affected by a more immediate and obvious reward.

We use statistical techniques to identify these two groups using the rich data on pupils available in the National Pupil Database. In fact, at least half of the pupils have economically meaningful positive effects, principally but not only for the cash incentive. We find that this has very substantial and statistically significant effects in Maths and Science. In the metric researchers use to compare results across studies, the Maths GCSE score increased by 16% of a standard deviation (SD), and the Science score by 20% of an SD. Education researchers will know that these are very large effects. Another comparator is that this has equivalent effect to a very substantial improvement in teacher effectiveness (one SD).

The best way of gauging the impact of the intervention is that for this group it is more than half of the impact of poverty (eligibility for free school meals, FSM) through secondary schools. This is worth emphasising: a one-year intervention costing around £200 – £320 per student eliminates half of the FSM gap in Maths and Science GCSE scores in the poorest neighbourhoods.

Of course, to repeat, there are pupils for whom this intervention has no effect, pupils who are already putting in a huge effort at school.

So who are these groups? The Figure below shows very clearly that the impact is on low attainers. The graph focusses on the effects of the cash incentive in Maths GCSE. Among pupils with low predicted GCSE scores, pupils in the intervention group scored substantially more than in the control group. That’s not true among pupils expected to do well: for them, the incentive makes little difference. In this sense, this intervention is perfectly targeted: unlike other interventions the group getting the most out of it are the policy-focus low attainers, not those already doing well.

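[Figure: effect of the cash incentive on Maths GCSE scores for the intervention and control groups, by predicted GCSE score]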

We also analysed the impact of the incentives on summary measures of GCSE performance, including passing the 5A*C(EM) threshold. For the low attainers, this increased significantly by around 10 percentage points, and not at all for the high attainers. This is important because of the very high earnings penalty to not reaching that benchmark, estimated at about 30% of earnings.

To summarise, we offered pupils incentives to raise their effort and engagement at school. This had very substantial effects on the Maths and Science GCSE performance of half of the pupils. This impact was high enough to wipe out half of the FSM attainment gap, and is concentrated on low attaining pupils. We ran the intervention as a randomised controlled trial for over 10,000 students in 63 schools in the poorest neighbourhoods of England. This seems to offer some very promising leads for schools and for policy makers. I’ll write more about that in my next post in a few days’ time.

 

Two points on league tables for Multi Academy Trusts

 

I think school performance tables play a valuable role, providing a channel for school accountability and also informing parents’ choice of school. Our research shows that their removal in Wales for a decade from 2001 significantly reduced average pupil progress and widened inequality.

So performance metrics, “league tables”, for school groups are likely to be useful too. Today, by some wildly improbable coincidence, three different studies are released providing these for Multi Academy Trusts (MATs) and other groups including Local Authorities. These come from (in alphabetical order), the DfE, EPI and the Sutton Trust; and they all contribute a lot to the debate.

These are still early days in the development of this methodology and no doubt more work will be done to refine the analysis, continuing a DfE working paper here. But if the legislation/determination/plan/desire/hope/vague preference goes ahead for all schools to join MATs, then they will become increasingly important. These initial exploratory papers are then also very important.

I want to make two quick points about the eventual form that MAT league tables might take. Both flow from the uncontroversial point that school groups are different from schools.

First, all three reports focus centrally on the average performance across the MAT: take an outcome measure and average it across all the schools in the group. There is discussion about the choice of performance metric, and the base year and so on. The central concern is a desire to not penalise MATs taking on low-performing schools. There is less discussion about the implications of there being many schools in a MAT.

But if the average is the sole focus of the performance tables then there are significant dangers. Schools (and groups) respond to what is measured. If the only thing measured is the group average then there may be a temptation for the MAT to prioritise some schools in the group at the expense of others. This might well raise the group average at the expense of one particular school. This prioritising might include channelling resources and assigning the most effective teachers. This could leave some schools and communities badly served. And crucially this would be invisible if the MAT average was all that was published.

So it seems to me imperative that MAT performance tables report the minimum performance as well as the average. This might be the performance of the lowest performing school in the group.  And publish the max too if you wish.
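A toy example makes the point concrete. The sketch below uses entirely hypothetical scores for a three-school MAT and an invented helper, `summarise`, to show how a rise in the group average can hide a fall in the weakest school.

```python
from statistics import mean

# Hypothetical performance scores for the schools in one MAT (illustrative only).
before = {"School A": 50, "School B": 50, "School C": 50}
# Resources are shifted towards A and B at the expense of C.
after = {"School A": 58, "School B": 57, "School C": 41}


def summarise(scores):
    """Return the group average alongside the weakest and strongest school."""
    values = list(scores.values())
    return {"mean": mean(values), "min": min(values), "max": max(values)}


print(summarise(before))
print(summarise(after))
```

A table showing only the mean would record an improvement; adding the minimum reveals that one school, and the community it serves, is doing worse.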

(Of course, the same argument applies to the current league tables: that they focus on the average across pupils in the school (although, briefly, they didn’t)).

Second, some chains are local, others are more geographically spread. Both configurations have positives and both have problems, for discussion another day. For parents choosing schools, there is only limited value in knowing the national or regional performance of a MAT, because chances are there is only one “branch” of that MAT you can reach, and that is the key information you need. So group level MAT performance tables can only be part of the answer; if they were all that was published, parental choice would be manifestly less well informed.

For those of you still reading, the obvious answer to both points is: publish school level tables alongside MAT level tables. Indeed; and ideally they would be published in an integrated way that was both comprehensive and comprehensible. Someone must know a 14 year old web wizard with a strong aesthetic sense.

But it seems that that may not happen. Some have suggested that the White Paper implies that schools in MATs cease to exist as separate entities. In that case, performance tables at “school” level are simply not feasible. This seems a very retrograde step and strongly undermines accountability.

As the debate around setting up MAT performance tables matures, we must ensure that these do not provide inappropriate incentives to MATs, and that they support not undermine parental choice.

Interpreting the numbers of school admissions – is the first preference offers rate too high?

Figures were released yesterday showing how many families received an offer from their first preference school. The headline number was 84% for secondary schools: that is, 84% of families were offered a place in the school that they put top of their application form. The coverage yesterday mostly focussed on the overall supply of school places. Maybe there is also a sense that “only” 84% got their first choice.

Acknowledging the disappointment or worse that individual families will feel at missing out, how should we interpret that 84%? It’s obviously affected by two things, both the choices made, and the number of places available. ‘Demand and supply’ if you like – and that’s not inappropriate here as the school admissions algorithm is what acts as the market clearer.

Imagine a country a bit like England, but much simpler, more abstract. And also imagine that all that families care about in schools is their academic quality. In that country, as in England, 79% of schools are rated as Outstanding (26%) or Good (53%). If all schools are about the same size, then 79% of school places are in Outstanding or Good schools. Suppose that everyone can access at least one Outstanding or Good school, and that everyone applies to one of those schools. If schools and people are spread around the country in a reasonably even and regular way, then 79% of them will get in and the remainder will be offered places in schools less than Good.

In that situation, 84% getting their first choice seems ok.

But what if everyone was a bit more ambitious in their choice, and everyone put an Outstanding school as their top choice? Why not?

Only 26% would get their first choice. Suddenly, in this abstract, regular country, 84% seems to suggest a lot of unambitious choices, ones that are likely to succeed rather than a ‘true’ first choice.
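The two scenarios in this abstract country boil down to a simple ratio of places to applicants. The sketch below is a deliberately crude model under the same assumptions as the text (equal-sized schools, evenly spread families); the function name is hypothetical, and the 26% and 53% shares are the ones quoted above.

```python
def first_preference_rate(share_of_places, share_of_applicants):
    """Share of first-preference applicants who can be offered a place,
    assuming places and applicants are spread evenly across the country."""
    if share_of_applicants == 0:
        return 1.0
    return min(1.0, share_of_places / share_of_applicants)


outstanding, good = 0.26, 0.53   # shares of schools quoted in the text

# Scenario 1: everyone names an Outstanding or Good school first.
print(first_preference_rate(outstanding + good, 1.0))

# Scenario 2: everyone names an Outstanding school first.
print(first_preference_rate(outstanding, 1.0))
```

The observed 84% sits above even the first scenario, which is what prompts the question of how ambitious the stated first preferences really are.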

Of course, our country is not simple and regular like that. There is geographical clustering of Outstanding schools, so some families will face a much higher chance of getting into an Outstanding school. In other places, there may be no Outstanding schools, so even families looking for the very best school academically can only put a Good school top, and they too have a pretty high chance of getting that. So taking that into account, even “everyone chooses an Outstanding school” would result in more than 26% getting their first preference, but not as high as 84%.

So I think we can say that the 84% might be too high for comfort. Maybe it reflects a school admissions system that favours those who can buy access to the best schools through their home’s location. The key role of proximity in resolving who gets into the popular schools keeps many of the Outstanding schools out of the choice set of poorer families. We need to change this, as I discussed here.

Part of the solution to the teacher shortage?

 

What can we do about teacher recruitment? Yesterday’s National Audit Office Report has provided some useful clarity and confirmed that the Department for Education has missed its recruitment targets for four years now. A number of commentators have highlighted the shortfall of new teachers through the year, though the DfE has argued to the contrary. The range of views on the shortfall suggests the gap may be between zero and 18% (quite a range!)

A number of suggestions for policy responses have been made: pay all teachers more; pay some teachers more; review the teacher workload; reform the Ofsted process to make it less stressful. These may well be great ideas; but they are all expensive or very expensive, and slow or very slow to implement.

Here’s another idea, which could contribute to raising recruitment, though undoubtedly would not wholly solve the problem. This is more or less costless (apart from the cost of the extra teachers obviously) and easy to implement.

We have imposed a major but pointless restriction on the pool of potential teachers – we can just drop that restriction. We can, effectively, stop shooting ourselves in the foot.

The point is this: there is a general view threading through the teacher recruitment system that applicants with better degrees will make better teachers. I’ll illustrate that in a moment. But all the statistical evidence we have on teacher effectiveness says that that is not true: a teacher’s ability to raise the attainment of her pupils is unrelated to her own academic qualifications.

There are a number of explicit points in the system at which the boundary between getting a II.1 degree and a II.2 degree is crucial. I would argue that these create a mindset on appropriate qualifications for good candidates that pervades the system much more widely. For example, bursaries for teacher training are only available to people holding a II.1 or better in some subjects. The official ‘Get into Teaching’ website makes this clear. This is not true in all subjects: for sciences, maths and languages, the applicant’s degree class makes no difference to the bursary. So I repeat that this proposal would have only an indirect effect on those subjects. Another example is Teach First, which requires a II.1 or better for its applicants.

So while the II.1 restriction certainly does not apply universally for all teacher recruitment, it is likely to have a much broader impact on the views of recruiters and selectors on what a good teacher looks like.

Over the last decade or so, economists have focussed a lot of research effort on teacher effectiveness. The research evidence shows clearly that teacher effectiveness is unrelated to the teacher’s own academic qualifications. Teachers who themselves got a First class or a II.1 degree are no more effective teachers than those who got II.2s. The NAO Report hints at this too.

The one study for England that measures this (our own) makes this point. The much more numerous research studies in the US show this (see my review here). Among researchers, it is an uncontroversial finding. Even researchers who set out to show that having a Master’s degree should help end up finding it doesn’t (Ladd and Sorensen, 2015, reference in here).

So the explicit or implicit restriction of teacher recruitment to those getting at least II.1s is pointless – it does not achieve its aim of raising average effectiveness. But it is harmful: it restricts the hiring pool significantly. At the risk of repetition: this is not about a quantity – quality trade-off in hiring – by relaxing this constraint we can seek more quantity at no cost in quality.

How big a difference might this make? It’s very hard to say at a high level of generality. The NAO commented that uncoordinated data sources make this a difficult area to track.

Reaching for a very small envelope, turning it over and starting to scribble: the percentage increase in recruits is equal to the percentage increase in the hiring pool times the relative likelihood of applying from the new group times the relative likelihood of someone in the new pool being acceptable. If we assume the current hiring pool is all with a II.1 or better, and we are proposing to expand this to include people with II.2s as well, using HESA data (chart 9) that’s an increase of 35%. Of course there are other routes in as well as from the flow of new graduates, as well as people from outside the UK, so let’s call it 30%.

We know that students with II.2 degrees have lower rates of return than those with higher classes, so presumably face worse alternative job opportunities. They might therefore be more likely to apply to teaching posts. But to be cautious, and underestimate the likely effect, I assume a relative application rate of 1. For relative acceptability, I need to account for the fact that in some subjects, applicants with II.2s are already accepted. I will also assume that the marginal II.2 candidate is less acceptable than a II.1 candidate. Overall, let’s try a relative acceptability rate of 0.4.

Multiplying these numbers together (0.3 x 1.0 x 0.4) yields a potential increase in recruits of 12%. The previous paragraphs make clear how very rough an estimate that is, but you can choose your own numbers to try. So this proposal might increase recruits by around 10%, and these would be recruits of the same expected effectiveness at teaching.
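For anyone who wants to choose their own numbers, the envelope calculation is easy to reproduce. This is a minimal sketch: the function name `recruit_increase` is mine, the baseline inputs are the ones used above, and the alternative acceptability rates in the loop are purely hypothetical values for illustration, not estimates.

```python
def recruit_increase(pool_increase, relative_application_rate, relative_acceptability):
    """Back-of-envelope proportional increase in recruits.

    pool_increase: proportional growth in the hiring pool (0.30 means 30%)
    relative_application_rate: how likely the new group is to apply,
        relative to the existing pool
    relative_acceptability: how likely an applicant from the new group is
        to be acceptable, relative to the existing pool
    """
    return pool_increase * relative_application_rate * relative_acceptability


# Baseline numbers from the text: 0.3 x 1.0 x 0.4
baseline = recruit_increase(0.30, 1.0, 0.4)
print(f"Baseline increase in recruits: {baseline:.0%}")  # 12%

# Hypothetical alternative acceptability rates, to show how sensitive
# the answer is to that one assumption.
for acceptability in (0.3, 0.4, 0.5):
    increase = recruit_increase(0.30, 1.0, acceptability)
    print(f"relative acceptability {acceptability:.1f} -> {increase:.0%}")
```

Because the estimate is a simple product, any one input scales the result proportionally: halving the assumed acceptability rate halves the estimated increase.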

Why not remove the II.1 restriction? And work to counteract the view that teachers with II.2s will be ineffective teachers. It won’t reduce average teacher effectiveness and it will increase the applicant pool.

Just to finish, it is worth re-emphasising that while teacher numbers are important, much more important is the average effectiveness of teachers. All the evidence shows that being taught by an effective teacher relative to an ineffective teacher has a dramatic impact on attainment. To illustrate: having all effective teachers relative to all ineffective teachers for just one GCSE year wipes out half of the poverty gap in attainment. Getting more teachers into our classrooms matters, but understanding how to raise average effectiveness is the big prize.