National standards: this data is not for ranking

Ranking individual schools against each other using their national standards data is a waste of time. Students are assessed by overall teacher judgement, not standardized testing. Getting some sensible moderation in place is one of the biggest challenges in the implementation of the standards, but people need to accept that moderation will never be as rigorous as it is under standardized testing regimes. For reasons I’ll touch on later, this is by design.

A school’s national standards data is not useful information for a parent choosing which school to pack a kid off to. (Given zoning and the high fixed costs of moving, there is limited scope for school choice in the first place, so that might be a moot point.) Far better is knowledge of what a school does with its data.

If I were at the Ministry of Education (or—heaven forbid!—a parent), here’s what I’d be looking for.

I’d look for signs that schools are using the standards sensibly on a student level. Especially for students not at standard, I’d want to see individualized learning plans, with achievable benchmarks and milestones. Ideally, these plans would be designed collaboratively by teacher, student, and caregiver. Give the student a sense of direction and ownership: here’s what we want you to be able to do, here’s our plan for getting you there, and here’s how you’ll be able to feel your own progress along the way.

Over time, I’d look for signs that schools are using the standards, in conjunction with the learning plans, to do some value-added appraisal of teachers. I would incorporate this formally into professional development structures. I accept that there’s only so much a teacher can reasonably be expected to do for kids who turn up hungry, who have caregivers with significant reading difficulties, or who switch between schools a lot. (These are, incidentally, things that are thought to correlate pretty tightly with decile.) Placing appropriate weight on factors like these is something that I think the standards will be able to do over time, even if they’re fairly messy.

These things don’t really need tightly moderated standards. If comparison between schools is what you’re really after, then standardized tests would produce more useful data. But standardized testing was defeated politically, on grounds I think were actually quite sensible.

Standardized tests are just not a good cultural fit with New Zealand. There’s a national egalitarian streak: we don’t like putting anyone higher or lower than anyone else, and we especially don’t like it when it’s kids we’re sorting. We don’t like to pressure children academically to the point of causing stress. We worry about labelling effects on kids who are at risk of falling behind: effects that are real, powerful, long-lasting, and can be very destructive. These traits of ours are admirable, and saying “little Timmy got 35 percent in his Year 2 reading test” is offensive to just about all of them.

Besides, the literature on standardized tests is a minefield. As an example, unless they’re made meaningful to students (i.e. have consequences), there’s not much to suggest they help much at all with anything. But the point of the New Zealand variant on standards is that they’re snapshots of where students are at and tools for planning. They’re not meant to be consequential for students. This next point is important, so I’m going to italicize it: *most students probably don’t even know when their teacher is assessing them against the standards.*

Where cross-school comparisons might come in useful is in identifying stand-out schools so that successful and/or innovative practices can be, where appropriate, replicated more widely. Obviously, what works in a school within one particular cultural, social, and economic context won’t necessarily work for a school that’s in a totally different one. You do your data analysis and your case studies, then you devise sensible categories and work within them.

So long as moderation is at least better than hopeless, in time we’ll also learn quite a bit more than we already know about the impact of social and economic factors on academic attainment during the early years of school. This is important, because it’s these early years that are assumed to matter most. National standards data, for all its flaws, is (or can be made) rich enough to support meaningful research that will help us improve how we teach our children.

Depending on whether or not I can be bothered, I might write up some stuff on why I don’t like deciles as analytical tools. But not this week.

National Standards

Familiarize yourself with the standards themselves (reading and writing [pdf], mathematics [pdf]). Notice that assessment against the standards is done by overall teacher judgement and not entirely, or even principally, by standardized testing.

If you want to play with the shiny data set, Luis A Apiolaza has done you a favour by putting it into something sensible and R-friendly. I recommend reading all of the case studies on Stuff before diving in. Context matters.

My own prejudices. I am a policy analyst with nil expertise in education. I support publication of national standards data. I am agnostic as to whether national standards should have been introduced in the first place, but furious at those “public servants” who sought to obstruct implementation of a lawful Government policy: no integrity.

I’m mainly interested in what the standards data can and can’t tell us, and how it might be used to improve education outcomes. The lens I’m trying to look through is: how would I tackle this if I were at the Ministry of Education?