Tag Archives: Education Policy

Teacher performance pay without performance pay schemes

Author:  Simon Burgess

Amid the macroeconomic gloom, the Autumn Statement contained a line about teachers’ pay. The School Teachers’ Review Body recommends “much greater freedom for individual schools to set pay in line with performance”. Consultations and proposals are expected in the near future.

But simply giving schools the freedom to do this may be a rather forlorn hope of anything much happening. It is not clear that there is a substantial demand from schools for performance-related pay (PRP) schemes that has only been thwarted by bureaucratic restrictions. It is hard to see high-powered, tough-minded PRP schemes being introduced by more than a handful of schools, not least because we have not seen large scale deviations from national pay bargaining in academies in England despite their new freedoms to do so.

If that path seems unpromising, there are other ways of facilitating a greater reflection of performance in pay, discussed shortly. But first – is PRP for teachers a good idea in the first place? Does it raise pupil attainment? What are the ‘side effects’?

This is a question that economists have produced a good deal of research on. And to summarise a lot of diverse work briefly, the international evidence is mixed. Those on both sides of the argument can point to high quality studies by leading researchers that find substantial positive effects, or no effects. In both cases, interestingly, there appeared to be little evidence of gaming or other unwanted effects of the incentives.

There is little evidence specifically for England. Our own research found a substantial positive effect of the introduction of a PRP scheme, but given the varied results found elsewhere it would seem unwise to place too much weight on this one study. The underlying performance pay scheme was poorly designed but nevertheless had a positive effect on the progress of pupils taught by eligible teachers relative to ineligible ones.

And design is key. There are many reasons why a simple high-powered incentive pay scheme might be detrimental to pupil progress, which we have discussed here and here. These include the fact that teachers have multiple tasks to do, the problems of measuring the outcomes of some of those tasks, the complex mixture of team and individual contributions, and the potential impacts on implicit motivation. The overall message is that incentives work, but schemes have to be very carefully designed to achieve what the schemes’ proponents truly intend.

There is another way to facilitate a closer link between pay and performance that does not require any school to introduce a performance pay scheme.

Published performance information in a labour market can change the way that the market rewards that performance. The critical features are first that the organisation’s own output depends in an important way on this performance characteristic of an individual; second that the organisation has some discretion in the pay offers it can make to new hires; and thirdly that the performance information is public – is available and verifiable outside the current employer. In this case, the pay structure of the market will reflect the performance rankings: high-performing individuals will be paid more.

In teaching, the first two of these three conditions are met: teacher quality matters hugely for schools, and schools have some discretion over pay. Now, suppose we had a simple, useful and universal measure of each teacher’s performance in raising the attainment of her pupils (obviously we don’t at the moment; I come back to this below), and that this was published nationally, primarily for the attention of Headteachers. The idea is that Headteachers trying to improve the attainment of their pupils would be on the look-out for high-performing teachers when they had a vacancy to fill. Armed with this performance information, they might try offering a higher wage (or something else – it doesn’t have to be money) to tempt them to join their own school. Equally, the teacher’s current school may respond by raising the offer there. Over time, this process will tend to raise the pay of high-performing teachers relative to low-performing ones, whom no-one is trying to bid for.

This idea should not be a strange one. A number of professions have open measures of performance. Just today it is reported that performance measures for more surgeons will be made public in the summer of 2013; this is already true for heart surgeons.

It is well-known that PRP does two things: it motivates and it attracts. The outcome for pay described here will tend to make teaching more attractive to people who are excellent teachers and less attractive to those who aren’t.

There are a number of problems with this idea, though perhaps fewer than might appear at first glance. First, it could be argued that a performance measure derived from teaching in one school is not relevant to teaching in another school. Obviously each child and each school is unique, but it seems very unlikely that there is no commonality of context between one school and the next. Observation suggests this: teachers moving from one school to another are not counted as having zero experience, and Headteachers are often appointed from outside a school.

Second, there might be a fear that the teacher labour market would become chaotic, with everyone churning around from school to school in search of a quick gain. We have to recognise that there is substantial turnover of teachers now (http://www.bristol.ac.uk/cmpo/publications/papers/2012/wp294.pdf). But the main point is that it does not require much actual movement to make the market work. Schools can make counter offers to try to retain their star teachers and the end result is the same – higher salaries for high-performing teachers.

Third, any measure would be noisy, partial and imperfect. Of course, all such measures are. Whether a measure is perfect is not really the question; the question is how noisy and imperfect it is, and whether it contains enough information to be useful. One advantage in this case is that the consumers of these performance indicators are the people best able to judge their usefulness and their shortcomings: Headteachers. If such metrics are not useful, Headteachers will simply ignore them; there would be no compulsion to use them. Even in labour markets with some of the most detailed and finely measured performance indicators (for example, football or baseball) there are many moves between employers that do not work out. It is worth re-emphasising that these performance measures are bound to be imperfect and incomplete, but broad measures of performance may nevertheless be very useful.

There are useful parallels to be drawn from another profession: academics. For academics, the combination of very detailed and public performance information and a context where research performance matters a great deal to universities seems to have had a substantial effect on academics’ pay.

The Research Assessment Exercise (RAE) and more recently the Research Excellence Framework (REF) have made a strong research performance very important to a university’s standing and its income. But the critical factor for academics is that an individual’s research performance is public knowledge, through very detailed recording of the impact of their research papers. Departments and universities aiming to improve their ranking seek out star researchers and attempt to bid them away with higher salaries (plus other things such as research facilities). These offers may well be matched by their current employer, but the end result is that salaries now seem to be much more closely correlated with research productivity than before the RAE/REF (I say “seem” as there does not appear to be any evidence on this, so this is casual empiricism). This is a lot of what drives many young researchers to put in very long work hours: having a paper published in a top scientific journal early in a career has a substantial lifetime payoff even in a world with few or low-powered incentive schemes. If you check out academics’ websites you will invariably see their academic output prominently displayed.

Again, an important feature is that these indices of research output are largely consumed by other academics who are aware of their strengths and weaknesses. So although they are far from perfect, they are used by precisely the people best placed to calibrate their usefulness appropriately.

If we are to go down a path of tying teacher pay more closely to performance, and yet respect the rights of increasingly autonomous schools to determine their own pay systems, then this might be an option to consider.  The challenge is to devise a measure that is simple, useful and universal. It would measure the progress made by the pupils that teachers taught, it would have to deal with normal variations in performance by averaging over a number of classes and a few years, and be on a common metric.  This is not straightforward, but if it gave rise to a robust broad measure of performance it could form a part of performance pay for teachers, and performance management more broadly. It could also have substantial effects on the pay of high-performing teachers.
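As a rough illustration of what such a measure might involve (a minimal sketch, not a proposed implementation): suppose each class already has a progress score on a common metric, for example the average gap between pupils' actual and predicted exam results. The measure then simply averages across a teacher's classes and years, and withholds a figure until enough classes have been observed. The function name, the record layout and the minimum of four classes are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def teacher_progress_measure(records, min_classes=4):
    """Average class-level pupil progress per teacher.

    records: iterable of (teacher, year, class_id, class_progress), where
    class_progress is assumed to be on a common metric already (e.g. mean
    actual-minus-predicted exam score for the pupils in that class).
    Teachers with fewer than min_classes observed classes get no figure,
    because a single class or year is too noisy to report.
    """
    by_teacher = defaultdict(list)
    for teacher, _year, _class_id, class_progress in records:
        by_teacher[teacher].append(class_progress)
    return {
        teacher: mean(scores)
        for teacher, scores in by_teacher.items()
        if len(scores) >= min_classes
    }
```

How many classes and years are needed before a figure is reliable enough to publish is an empirical question that this sketch does not settle.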

Reforming teacher training

Rebecca Allen and Simon Burgess

This week the House of Commons Education Select Committee published its report on the teaching profession. This post gives the main points of our evidence to the Committee.

We think of Initial Teacher Training (ITT) as encompassing both the initial training and the probationary year. How should this be set up to produce the most effective teachers who will have the greatest impact on pupil progress? ITT plays two roles for the profession – training and selection – with the emphasis typically placed on the former. Both are important and neither should be neglected, but we argue that the evidence suggests that if anything, selection is the more important, and this is our focus here. An important role for selection is completely standard for any professional accreditation system in either public or private sectors.

The key argument is this: the sharpest selection should be made at the point when the evidence on ability is strongest. The final decision on who can become a teacher should be made when we have accumulated enough evidence on the candidate’s teaching effectiveness. Where is this point in teaching? The two central relevant facts are that variations in teacher effects on pupil progress are very substantial, and that the future effectiveness of a potential teacher is hard to judge from their own academic record.

We believe that the current operation of selection in ITT (tight at the beginning, negligible thereafter) is the wrong way round. Instead, we should let a broader group try out to be teachers, but enforce a much stricter probation policy based around measures of teacher effectiveness in facilitating pupil progress. Full certification and an open-ended first job would only be granted once performance data showed a teacher to be effective. The expectation would be that only the most effective teachers would make it through to full certification.

Selection into ITT is about gaining a place on a course. The difficulty faced in identifying people likely to be good teachers is very relevant here. It is very hard to tell who will be a good teacher and therefore a high degree of agnosticism would be appropriate when faced with applicants. This is certainly true for selection based on objective criteria from the applicants’ own academic records. We know that these are unrelated to teaching ability, and so should be irrelevant in selection into ITT. Beyond that, even if selectors are highly skilled at spotting potential, and it is not clear that they are, it is impractical to ask each applicant to teach a practice lesson. Therefore, selection into ITT should be very broad, with a relatively low academic entry requirement.  This of course is not the situation now, nor the direction of travel of current policy. The tightening of academic entry requirements into teaching is not helpful: it will restrict the quantity of recruits and have no impact at all on average teaching effectiveness.

Graduation from ITT should also be tough. Given that much of an ITT course is now school-based, time spent in the classroom will form an important part of the assessment. Arguably the classroom experience is the key part of the course. However, in such a short space of time it will not generate sufficient data for a robust and objective view of the trainee’s effectiveness. It will nevertheless allow the trainee to discover whether teaching is for them.

Once in a job in a school, the progression to being a qualified teacher should be very different to the typical experience now. The key decision on final certification should be made after a probation period of say three years and ideally, the probation should involve classes of varying ability and year group. The period probably cannot be less, though the appropriate length of the probation would need to be analysed properly, depending on the statistical reliability of any pre-hire indicators, school-based performance data, and the cost of being wrong. This is the point when enough data is available to make a reliable judgement on the effectiveness of the teacher. There should be an expectation that not all will make it through to final certification, and indeed only the most effective should be retained. The key judgement should be a minimum threshold of progress that the probationer’s pupils make. Obviously, the measurement of that progress and the parameters of the threshold require a great deal of careful work. Like any statistical data, estimates of teacher effectiveness will never be perfect, and a good deal of evidence over a number of years will be necessary to reach a decision, but this is clearly necessary to raise the average effectiveness of the teaching profession in England.

Another innovative route into teaching is through Teach First. In some ways this is a positive development, as it allows a lot of people to try out teaching and also gives the schools which employ them an ‘out’ at the end of the two years. On the other hand, it restricts entrants based on their academic background.

It is important to see the teacher labour market as a whole, and to see how the different stages of a teacher career fit together. It seems to be very hard to fire ineffective teachers. While the regulations on this have recently changed, generating a culture that encourages headteachers to take a more proactive stance seems harder. While this may change, it may be that the best way to reduce the problem of low-performing teachers is to make it very difficult for ineffective teachers to get into the profession in the first place.

These changes would make starting out on a teaching career much more risky financially. In order to maintain the same average lifetime expected income from the profession, the pay rate of those making it through to final full certification will need to be higher. And the lower is the chance of making it through, the higher is the full professional pay.
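The arithmetic behind this can be sketched as follows, under simplifying assumptions of our own: a candidate passes final certification with probability p, earns some fallback income if they do not, and the expected income from starting out is to be held at a fixed target. All names and figures here are hypothetical.

```python
def required_certified_pay(target_expected, pass_probability, fallback_income):
    """Pay for fully certified teachers that keeps expected income at target.

    Solves target = p * certified_pay + (1 - p) * fallback_income
    for certified_pay. The fallback income for those who do not pass
    probation is an assumption of this sketch, not a figure from the text.
    """
    if not 0 < pass_probability <= 1:
        raise ValueError("pass_probability must be in (0, 1]")
    shortfall = (1 - pass_probability) * fallback_income
    return (target_expected - shortfall) / pass_probability
```

For example, with a target of 30,000 and a fallback of 24,000, a pass rate of 0.8 implies certified pay of about 31,500, while a pass rate of 0.5 implies about 36,000: the tougher the probation, the higher the pay needed for those who get through.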

In summary, we think that the evidence shows that the selection aspect of ITT is completely the wrong way round. Selection is tight to get into ITT in the first place, but once in, progression to full certification is normal and expected. The process needs to be more appropriately agnostic about likely teaching ability in the first place. It should also allow a broader group of people to try out teaching, but have a much tougher probation regime before trainees are given final certification. It makes much more sense to make final decisions later once more evidence on effectiveness has accrued.

Reforming school performance tables

Rebecca Allen and Simon Burgess

Yesterday the Government published its response to the Wolf Review on Vocational Education. The Response sets out a number of proposals, accepting all of the Review’s recommendations. These include the eye-catching scheme to ensure that young people who do not achieve C grade in English and maths at age 16 continue studying them to age 19.

The response also proposes reforms to school performance tables. This is based on a recognition that schools’ behaviour in selecting qualifications for their students is strongly influenced by the incentive structure they face. A crucial component of this structure is the published school performance tables. These tables are important in influencing parental choice of school, and school leadership teams pay them a lot of attention.

From this year, the content of the performance tables will change quite significantly. The long-standing measure of the percentage of students achieving at least 5 A* to C grades will be retained. But in addition, a differential average points score will be published for each school, which provides information on how well the school does for students at the lower and upper ends of the ability distribution, as well as at the average:

“It is vital that performance indicators do not inadvertently cause schools to concentrate on particular groups of pupils at the expense of others. To avoid this we will continue to include performance measures, like average point scores, which capture the full range of outcomes for pupils of all abilities. In addition, from 2011 the performance tables will show for each school the variation in performance of low attaining pupils, high attaining pupils and those performing as expected.”  (Wolf Review of Vocational Education, Government Response, p. 6).

This is a step forward. In our analysis we argued for exactly this measure: average GCSE points score, presented at three points in the ability distribution, low ability, average and high ability. Our criteria were functionality of the performance measure, relevance to parents and also comprehensibility. A measure is relevant if it informs parents about the performance of children very similar to their own in ability and social characteristics.  It is comprehensible if it is given to them in a metric that they can meaningfully interpret.  It is functional if it helps parents to answer the question: “In which feasible choice school will my child achieve the highest exam score?”.  Overall, this performance measure came out on top. We also described ways that the information could best be displayed for parents: paper-based and web-based delivery mechanisms.
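To make the shape of this measure concrete, here is a minimal sketch. It assumes each pupil record carries a prior attainment score and a GCSE points total, and that two cut-offs split pupils into low, average and high ability bands. The function name, the record layout and the cut-offs are hypothetical; the published tables may well define the groups differently.

```python
from statistics import mean


def banded_points(pupils, low_cut, high_cut):
    """Average GCSE points for low, average and high prior-attainment pupils.

    pupils: iterable of (prior_attainment, gcse_points) for one school.
    Pupils below low_cut count as 'low', above high_cut as 'high', and
    everyone else as 'average'. A band with no pupils reports None.
    """
    bands = {"low": [], "average": [], "high": []}
    for prior, points in pupils:
        if prior < low_cut:
            bands["low"].append(points)
        elif prior > high_cut:
            bands["high"].append(points)
        else:
            bands["average"].append(points)
    return {band: (mean(vals) if vals else None) for band, vals in bands.items()}
```

A parent would then look at the figure for the band closest to their own child, rather than a single school-wide number.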

A second issue is that the “price” or GCSE-equivalent points of the new vocational exams seems set to change. Precise details of this are unclear at the moment. It is worth making the point again that schools will have an eye on the performance table impact of courses they offer to students. If vocational qualifications are to be worth less than they are at present, there is a danger that schools will not be keen to promote them to students who may be unlikely to score highly on more academic courses. In turn, this may make schools less keen to accept low ability pupils.

Of course, the old league table measure of percentage with 5 A* to C grades is staying. Perhaps there is a performance management version of Gresham’s Law and good performance measures will drive out bad ones. If parents come to rely more on the new measure, the media will give it more prominence and the grip of the “%5A*-C” measure on the public mind will finally begin to weaken.

Are school league tables any use to parents?

Rebecca Allen and Simon Burgess

Today is “school league tables” day. Performance tables are released for schools and colleges in England, reporting a number of different measures of the exam performance of their students. While much attention this year will focus on the reporting of the new “English Baccalaureate”, we ask a more fundamental question: are school league tables in general any use to parents?  One of the major aims for school league tables is to support and inform parents in choosing a school for their child: but are they fit for this purpose? The answer is “yes” – we show that using school league tables does help parents to identify the school in which their own specific child will do best in her future exams.

Parents consistently rank academic standards as being one of the most important criteria for choosing a school. The performance tables provide outcome measures that are very widely reported and easy to get hold of. The idea is that parents can scrutinise the results and weigh up the merits of the local schools, considering the academic performance, travel distance, the child’s own wishes and other factors before deciding which schools to write down on their application form.

But this idea has been subject to a number of critiques. There are three main lines of argument. First, it is argued that differences in raw exam performance largely reflect differences in school composition; they do not reflect teaching quality and so are not informative about how one particular child might do at a school. Second, schools might be differentially effective so that even measures of average teaching quality or test score gains may be misleading for students at either end of the ability distribution. Some school practices and resources might matter more for gifted students, and others for low-ability students. Third, it is argued that the scores reported in performance tables are so variable over time that they cannot be reliably used to predict a student’s future performance. After all, today’s league tables reflect last year’s students’ exams, but a parent wants to know how her child will do in five years’ time.

It is an empirical question how quantitatively important these points are: are league tables helpful or not? The question on academic standards that parents want answered is: “In which feasible choice school will my child achieve the highest exam score?”. We argue that the best content for school performance tables is the statistic that best answers this question.

To answer this question, we use the long run of pupil data now available to researchers. We can follow students through their years at secondary school and see how they did in the exams at the end; that is standard. But we can also use statistical procedures (details) to estimate the counter-factuals of how that student would have done if s/he had gone to a different local school. We can then ask: if families had picked schools according to the league table information available at the time, would that have turned out to have been a good choice in terms of subsequent exam performance for that specific child? Focussing on the simplest measure of the school’s %5A*-C score, the results show that while it certainly does not produce a good choice for everyone, it produces a good choice for twice as many students as it produces a poor choice for. So on average, a family using the schools’ %5A*-C scores from the league tables to help identify a school that would be good academically for their child will do much better than the same family ignoring the league table information.
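The logic of this comparison can be sketched in a few lines. This is an illustration only, not the statistical procedure used in the research. It takes the estimated counter-factual exam scores as given, and it adopts a working definition of our own: a choice is "good" if the child's estimated score at the school with the highest league table figure beats that child's average across their feasible schools, and "poor" if it falls below. The function name and data layout are hypothetical.

```python
def choice_outcomes(students, league_score):
    """Count good and poor choices made by following the league table.

    students: list of dicts, one per student, mapping each feasible
        school to that student's estimated exam score at that school.
    league_score: dict mapping school to the published %5A*-C figure
        available when the choice was made.
    Returns (good, poor). A student whose chosen-school score exactly
    equals their average across feasible schools counts as neither.
    """
    good = poor = 0
    for estimated in students:
        # The family picks the feasible school with the best published score.
        picked = max(estimated, key=lambda school: league_score[school])
        benchmark = sum(estimated.values()) / len(estimated)
        if estimated[picked] > benchmark:
            good += 1
        elif estimated[picked] < benchmark:
            poor += 1
    return good, poor
```

The ratio of the two counts is then the kind of summary figure reported above.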

So are the league tables useful for parents? Definitely. Can they be improved? Certainly. The measures included in the performance tables should be judged according to their functionality, relevance, and comprehensibility. The test of functionality is the analysis just described. A measure is relevant if it informs parents about the performance of children very similar to their own in ability and social characteristics. It is comprehensible if it is given to them in a metric that they can meaningfully interpret. In fact, none of the current leading performance measures score very well across our three criteria. We have proposed an alternative measure that performs better on these criteria. No measure can be perfect because there are important trade-offs between relevance, functionality and comprehensibility: the more disaggregated the form in which performance tables are provided (increased relevance), the less precision they will have (decreased functionality). The more factors are taken into account in describing school performance for one specific child (increased relevance), the more complex the reported measure will be (decreased comprehensibility). Any choice on the content of league table information has to make decisions on these trade-offs.

Response to Education White Paper I: Teachers and teaching

Rebecca Allen and Simon Burgess

The focus of the new Education White Paper (WP) is advertised in the title: “The Importance of Teaching”. Teachers are rightly lauded as the most important single factor in creating a good education. The reforms relate principally to training new teachers, with additional discussion of the constraints and bureaucracy that teachers face.  The White Paper calls for shifting the emphasis of teacher training from university-based to school-based training, the argument being that this is where the “craft” of teaching is better learnt, and that this will generate more effective teachers.

We believe that the WP presumes more robust evidence on this issue than actually exists. It is hard to legislate on the best way to train teachers when we are not really sure what makes a good teacher, or what effective teachers do. We need to be realistic in terms of what we know, and also in terms of the wider context around teacher development.

There are a number of prior questions that need more robust answers than they currently have to properly address this policy issue. For example: To what extent are good teachers born or made? What do effective teachers do? What motivates teachers? We discuss new teachers first and then existing teachers.

The two key issues around new teachers are recruitment and training. The research evidence suggests that the recruitment of teachers matters a great deal. This evidence can be used to design the ideal personnel policy, the ideal contract for teachers. The facts are that teachers are very different in effectiveness but that this is hard to spot pre-hire as it does not appear to be well correlated with characteristics such as degree class or subject; and that this level of effectiveness tends not to increase with experience after the first two or three years. The current teacher entry system involves making the sharpest selection before training (to be raised to a good university degree), giving training, but thereafter only mild selection: that is, most people pass their training, and then passing probation (achieving QTS) is relatively straightforward in most schools. The evidence suggests a better policy would be exactly the reverse: a much more open and inclusive approach to who can begin teacher training, coupled with a much tougher probationary policy.

It is hard to give strong advice about a model for teacher training, given only a sketchy idea of how effective teachers operate. But in practical terms, students on teacher training courses already spend about two thirds of their time in school rather than in the university lecture hall; the scope for major gains from further time in school does not seem large. Furthermore, a timely OFSTED report on initial teacher training found more outstanding university-based teacher training courses than outstanding school-based ones. The implications for schools of taking a larger role in teacher training also need some consideration, particularly given the squeeze in resources that is coming.

There are about 400,000 teachers in England, and the turnover is about 20,000 per year. So even if the average effectiveness of new teachers can be significantly improved, this will only have a marginal impact on overall effectiveness for at least a decade. Increasing the effectiveness of existing teachers offers much greater scope for rapid improvements in standards.
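A back-of-the-envelope sketch shows how gradually a gain among new entrants feeds through to the workforce average. It assumes a constant stock of teachers, reads the turnover figure as the number replaced each year, and assumes leavers are drawn at random from the whole workforce. These are simplifications of our own, and the function name is hypothetical.

```python
def share_of_improved_entrants(stock, entrants_per_year, years):
    """Share of the workforce recruited under the improved system.

    Assumes a constant stock, with entrants_per_year teachers leaving at
    random from the whole workforce each year and being replaced by
    entrants trained under the new system.
    """
    replacement_rate = entrants_per_year / stock
    return 1 - (1 - replacement_rate) ** years


def gain_in_average(stock, entrants_per_year, years, entrant_gain):
    """Rise in average effectiveness if each new entrant is entrant_gain better."""
    return share_of_improved_entrants(stock, entrants_per_year, years) * entrant_gain
```

With the figures above the replacement rate is 5% a year, so a gain of g in the effectiveness of one year's entrants raises the workforce average by only 0.05 × g in that first year.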

The counterpart to focussing initial training on schools is to emphasise and enhance training on the job, continuing professional development (CPD). The picture painted by the economics evidence suggests a model of informal, small-scale, within-school or even within-department groups would work well, with colleagues learning from the most effective teachers. Whilst CPD is discussed at some length in the WP, it has not been the focus of interest and discussion that it should be.

The broader question is why this has to be pushed towards teachers, why there isn’t much of a demand for it from most teachers. Raising the value of being an effective teacher might help fuel this demand. We know that teachers do raise their teaching effort given incentives, and it seems likely that they would also be keener to invest in their own capability to be effective. This incentivisation could be very simple and need not be personal financial gain. It could be simple pride and satisfaction from being top of a list of teachers in the staff room, or additional resources for a project chosen by the teacher, or it could be a pay bonus for the teacher.

The focus on teachers and teacher effectiveness is to be applauded. It is less clear that the right policies have been selected to enhance this.