How to Write a Performance Review for a Teacher

Most teacher reviews I read feel like they describe someone the writer hasn't actually watched teach. The result is a document the teacher reads once, files away, and quietly loses respect for. Here's how to write one that doesn't do that.

11 min read · Updated 12 May 2026

I’ve read a lot of teacher performance reviews, and the same problem appears in most of them: they describe someone the writer hasn’t actually watched teach. They lean on test scores the writer hasn’t unpacked, on parent feedback paraphrased second-hand, and on observation notes from a single 20-minute visit nine months ago. The teacher reads the review, recognises that it could have been written about almost any colleague, and the document quietly loses authority for the next twelve months of conversations.

This is a guide to writing reviews that don’t do that. It’s aimed at heads of school, deputy heads, principals, assistant principals, subject leads, and anyone else who has to write reviews for the teachers they supervise. The frame is K-12 broadly, with examples that translate across primary and secondary contexts. The principles work the same in independent schools, state schools, and most international contexts.

Why teacher reviews are different from other reviews

Four things separate them from reviews in other professions. First, the work itself happens behind a closed classroom door. Unlike a sales rep’s call recordings or a software engineer’s pull requests, you can’t pull the artefact at will. You’ve seen the teacher teach during scheduled observations and maybe a couple of pop-ins. The rest of the year is invisible to you, which means evidence collection has to take a different shape from the start.

Second, student outcomes are partial. Test data is real signal, but it’s also heavily confounded by cohort composition, prior attainment, attendance patterns, and what was happening in students’ lives outside school. A teacher with a tough cohort and modest headline numbers might be doing better work than a teacher with a strong cohort and excellent ones. The review needs to read the numbers in context, not as a verdict.

Third, there are more stakeholders with strong opinions than in most other roles. Parents have views. Students have views (and they’re often more accurate than adults expect). Colleagues have views. Subject leads have views. Heads of year have views. A review that only reflects the writer’s observation impressions misses most of the available signal.

Fourth, the review will be read by people who do this job. Teachers are unusually sensitive to language that signals the reviewer doesn’t understand what teaching actually is. Words like “engagement” and “rigour” and “rapport” carry weight when they’re grounded in specific observation. Used loosely, they read as the words of someone who hasn’t been in a classroom in a long time.

Start by pulling the evidence you actually have

Before drafting a word, spend 45 minutes collecting evidence into a single document. Strong reviews draw on a wider evidence base than reviewers usually think they have. Look at:

  • Formal observation notes. Whatever framework you use (lesson observation forms, Danielson, Marzano, your school’s in-house rubric), pull every formal observation from the review period. Don’t paraphrase from memory. Re-read the actual notes.
  • Informal walkthroughs and pop-ins. The brief drop-ins you did between formal observations. Even two-minute snapshots add up across a year. If you don’t track these, the review period is a good moment to start.
  • Student outcome data with context. Whatever assessment data your school uses (standardised tests, internal assessments, end-of-unit summative tasks), pulled with the prior-attainment baseline. The growth from the starting point is the signal, not the endpoint score alone.
  • Student work samples. Two or three pieces from across the cohort. The quality of marking and feedback the teacher provides is often a better signal of pedagogical care than any observation moment.
  • The teacher’s planning artefacts. Schemes of work, lesson plans for a unit you’ve observed, the assessments they designed. These show what the teacher is actually thinking about between the lessons.
  • Parent communication patterns. Not individual complaints (those are pastoral), but the general pattern. Are parents reaching out positively? Are difficult conversations being handled directly? Have you had to step in on issues that should have been resolved teacher-to-parent?
  • Colleague input. A short conversation with the subject lead, the head of year, and the teaching assistant who works closely with the class. You’re looking for patterns across stakeholders, not gossip.

Forty-five minutes here makes the rest of the work trivial. You’ll be surprised at how much you missed when you tried to write reviews from memory.

The four-section framework

1. Student outcomes

What the students achieved, in context. Lead with growth from starting position, not endpoint scores. If Year 4 came in two months below the expected reading level and finished six months above it, that’s a different story from the same endpoint score on a cohort that came in at grade level. Be specific about the cohort: number of students with SEND or IEPs, EAL learners, pupil-premium proportion. The numbers are more meaningful with the context attached.
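To make the arithmetic concrete, here is a minimal Python sketch of the growth-from-baseline comparison, using the Year 4 reading example above. The `CohortResult` class and its field names are illustrative assumptions, not part of any school data system; positions are expressed in months relative to the expected level.

```python
from dataclasses import dataclass


@dataclass
class CohortResult:
    """Hypothetical record: positions are months relative to the expected level."""
    label: str
    baseline_months: float  # negative means below the expected level at the start
    endpoint_months: float  # position at the end of the review period

    def growth(self) -> float:
        """Growth from the starting point, which is the signal the review leads with."""
        return self.endpoint_months - self.baseline_months


# Year 4 came in two months below the expected level and finished six months above it.
year4 = CohortResult("Year 4 (started below level)", baseline_months=-2, endpoint_months=6)
# A comparison cohort with the same endpoint that came in at the expected level.
comparison = CohortResult("Comparison (started at level)", baseline_months=0, endpoint_months=6)

for cohort in (year4, comparison):
    print(f"{cohort.label}: endpoint {cohort.endpoint_months:+} months, "
          f"growth {cohort.growth():+} months")
```

Both cohorts finish six months above the expected level, but the first grew by eight months and the second by six. That difference is what the review should surface.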

2. Pedagogical practice

What the teacher actually does when teaching. This comes from observation evidence. Be specific. Not “effective questioning” but “in the Year 5 lesson on fractions in March, asked three follow-up questions of every student who answered correctly, which kept the class genuinely thinking rather than rushing to be picked.” The strong reviews name the specific instructional moves the teacher uses and where you observed them.

3. Professionalism and contribution

How the teacher shows up beyond their own lessons. Department contribution, parent communication, response to safeguarding concerns, reliability of marking, peer collaboration, contribution to school culture. This is the bucket where the “quiet team player” phrase usually shows up. Resist it. Name a specific moment.

4. Growth and development

Where this teacher is relative to last year, and what the next year should look like. Professional development completed (and what they took from it). Practice they adopted in response to last review’s feedback. The development priority for the coming year, named concretely. This is the section that makes the rest of the review actually change something.

Common traps to avoid

The test-score-as-verdict trap

Writing the review around the headline data without unpacking the cohort and the growth. Two teachers with the same headline result are usually doing different work, and the review should reflect that. Pulling the growth-from-baseline number rather than the endpoint-only number takes ten extra minutes and produces a fundamentally different review.

The single-observation trap

Writing the review on the basis of one or two formal observations and treating those as representative. Anyone who’s ever been observed knows the observation lesson is not the average lesson. Strong reviewers weight informal walkthroughs, student work, and colleague input alongside formal observations rather than letting the formal lesson carry the whole picture.

The “lovely teacher” trap

Reviews of well-liked teachers tend to default to warmth and lose the substantive assessment. “The children love her,” “parents speak very highly,” “a real gem of the department.” These phrases mean something but they evidence very little, and a teacher whose review reads as warm-but-empty often plateaus precisely because the review didn’t push them.

The “outstanding”-tier inflation trap

Once a school has a few “outstanding” teachers, the bar starts to drift. Within two years half the staff are rated “outstanding” and the tier no longer means anything. Calibrate against the strongest teacher you have ever supervised, not against your current staff average. The teachers who deserve the top rating want it to mean something.

The 90-minute drafting flow

Plan for 90 minutes per teacher review. Less than that and the evidence collection gets cut; more and you start padding.

Minutes 0–30. Evidence collection. Open the observation notes, assessment data, planning artefacts, and parent-communication notes from the period. Dump bullets into a scratch document under each source. Don’t write narrative yet.

Minutes 30–60. Bucket and assess. Drop each evidence bullet into one of the four sections (outcomes, pedagogy, professionalism, growth). Some bullets fit two buckets; pick the primary one. Note where the evidence is thick and where it’s thin. Thin evidence is itself a signal worth naming in the review.
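One way to picture this step is as a simple grouping exercise. The sketch below is a hypothetical Python illustration, not a prescribed tool: the section names follow the four-section framework, while the sample bullets and the `THIN_THRESHOLD` of two bullets are assumptions made for the example.

```python
SECTIONS = ("outcomes", "pedagogy", "professionalism", "growth")
THIN_THRESHOLD = 2  # assumption: fewer than two bullets counts as thin evidence


def bucket_evidence(bullets):
    """Group (primary_section, note) pairs under the four review sections."""
    buckets = {section: [] for section in SECTIONS}
    for section, note in bullets:
        if section not in buckets:
            raise ValueError(f"Unknown section: {section}")
        buckets[section].append(note)
    return buckets


def thin_sections(buckets, threshold=THIN_THRESHOLD):
    """Return the sections whose evidence falls below the threshold."""
    return [s for s in SECTIONS if len(buckets[s]) < threshold]


# Illustrative bullets only; real ones come from your own notes.
sample_bullets = [
    ("outcomes", "Reading: cohort started 2 months below expected level, finished 6 above"),
    ("outcomes", "Growth from baseline pulled for the whole class"),
    ("pedagogy", "March fractions lesson: follow-up questions after correct answers"),
    ("pedagogy", "Walkthrough: routines settled within two minutes"),
    ("professionalism", "Handled a difficult parent conversation directly"),
]

buckets = bucket_evidence(sample_bullets)
print(thin_sections(buckets))
```

With these sample bullets, the professionalism and growth sections come back as thin, which is the cue to either gather more evidence or say so in the review.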

Minutes 60–90. Draft and sharpen. Write each section in five or six sentences, leading with the strongest evidence. Then cut anything generic. If you wrote “builds strong rapport” or “effective classroom manager,” replace it with the specific observation moment that evidences it.
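The final cut can be partly mechanised. As a rough sketch, assuming you keep your draft as plain text, the hypothetical helper below flags stock phrases so you can swap each one for the observation that evidences it. The phrase list is drawn from the examples in this guide and is a starting point to extend, not an exhaustive list.

```python
# Stock phrases taken from the examples in this guide; extend with your own.
GENERIC_PHRASES = (
    "builds strong rapport",
    "effective classroom manager",
    "effective questioning",
    "quiet team player",
    "a real gem",
)


def flag_generic_phrases(draft: str) -> list[str]:
    """Return the stock phrases found in the draft, ignoring letter case."""
    lowered = draft.lower()
    return [phrase for phrase in GENERIC_PHRASES if phrase in lowered]


draft = "Builds strong rapport with the class and is an effective classroom manager."
for phrase in flag_generic_phrases(draft):
    print(f"Replace with a specific moment: {phrase!r}")
```

A simple substring match like this will miss paraphrases, so it supplements a careful read-through and does not replace it.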

What to do when you’re stuck

Three common stuck-points come up.

“Their headline data is poor but the teaching looks good.” The review should say that explicitly. Name the headline result. Name the cohort context. Name the specific pedagogical practices you’ve observed and what they should produce over a longer time horizon. A teacher with strong practice on a tough cohort whose results lag is a different story from a teacher whose practice and results both lag, and the review needs to make the distinction.

“I haven’t observed them enough to write a confident review.” Address that directly. Be honest about the evidence base. Pull in the colleague input and student-work review more heavily. Make it a developmental priority for next year that the observation cadence is more regular. A review with a thin evidence base, written honestly, is better than a thick-prose review built on guesswork.

“They’re plateauing and I’m not sure how to name it.” Plateau in teaching usually shows up as a flat range of instructional strategies, a stable but unchanging set of planning habits, and a reluctance to engage with feedback. Specifically: classroom routines that have stopped evolving, schemes of work that haven’t been revised in two years, professional learning that consists of mandatory CPD without elective engagement. Name the pattern. Suggest the specific next step: a peer observation cycle, an action-research project, a teacher learning community (TLC) group on a specific instructional move.

Don’t let the drafting itself be the bottleneck

Most of the work above is the thinking: pulling the evidence, reading the observations carefully, separating the cohort context from the teaching contribution. The actual typing-up is maybe 25 to 30 minutes per teacher if you’ve done the prep. For a department lead with 8 teachers to review, that’s still up to 4 hours of writing within roughly 12 hours of review work overall, at 90 minutes per teacher. If you want to compress the writing time without losing the substance, this is exactly what Crestento is built for. The K-12 teacher system prompt is calibrated to classroom language, the four sections map onto the structured input, and the AI won’t invent an observation moment or a student outcome you didn’t provide.

Frequently asked questions

How long should a teacher performance review be?

About 500 to 800 words. Long enough to cover the four sections (student outcomes, pedagogical practice, professionalism, growth) with specific observation moments and contextualised data in each. Teacher reviews under 400 words usually lean on observation impressions without grounding them. Reviews over 1,200 words tend to pad with warmth that evidences nothing.

How do I write a performance review for a teacher with disappointing test scores?

Lead with the cohort context (prior attainment, SEND proportion, EAL learners, attendance patterns). Then assess the teaching specifically, separated from the outcomes. Name the pedagogical practices you've observed and what they should produce over a longer time horizon. A teacher with strong practice on a tough cohort warrants a different review from a teacher with weak practice, even if the headline scores look similar.

How many observations should I do before writing a teacher performance review?

Two formal observations and four or five informal walkthroughs across the year are the minimum I'd want for a confident review. The formal observations give depth on planning and execution; the walkthroughs reveal what the teaching looks like on a normal Tuesday. If you have fewer than this, the review should be honest about the thin evidence base and make observation cadence a developmental priority for the next year.
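As a quick check before drafting, that minimum can be written as a small rule. This is a hypothetical sketch: the function name and the choice of four walkthroughs as the lower bound of "four or five" are assumptions.

```python
MIN_FORMAL = 2    # formal observations across the year
MIN_INFORMAL = 4  # assumption: lower bound of "four or five" walkthroughs


def evidence_base_is_thin(formal: int, informal: int) -> bool:
    """True when the observation record falls below the suggested minimum."""
    return formal < MIN_FORMAL or informal < MIN_INFORMAL


# Two formal observations and five walkthroughs meet the minimum.
print(evidence_base_is_thin(formal=2, informal=5))
# One formal observation and three walkthroughs do not, so the review should say so.
print(evidence_base_is_thin(formal=1, informal=3))
```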

Should student feedback be included in a teacher performance review?

Yes, with care. Aggregated student-voice signal (from end-of-year surveys, departmental data, or structured focus groups) carries real weight and is often more accurate than adults expect. Individual student complaints are pastoral matters, not review evidence. The signal you're looking for is patterns across the cohort, not single anecdotes.

How do I handle disagreement between observation evidence and student outcome data?

Name the divergence explicitly in the review. Strong pedagogy with lagging outcomes usually means either the cohort is harder than the headline suggests, the teacher is in a multi-year build, or there's a specific gap between observed practice and assessed content. Weak pedagogy with strong outcomes usually means the cohort is doing the work despite the teacher. Both situations have honest paths forward; treating the data and the observation as confirming each other when they don't is the failure mode to avoid.