Grade the teachers

A way to improve schools, one instructor at a time

By Michael Jonas
November 1, 2009

A good teacher equals a good school year. Not always, but far more often than not. Ask any parents of an elementary-grade child how the school year is going, and it won’t be long before you’ll hear them rave about - or bemoan - the teacher their child has been assigned to. There are teachers who are duds, who can find a way to drain the fun out of a unit on dinosaurs for second-graders. And there are those with a gift for reaching the eighth-grader slouched in the back of the classroom with a penchant for eye rolling. These teachers can bring life to Poe’s fascination with the dead, or deliver just the right contemporary analogy to make sense of the War of 1812.

Nearly everyone can probably recall a teacher who lit their passion for poetry or who was able to help them connect all the dots in a seemingly incomprehensible algebra formula. We know that individual teachers can make a huge difference.

But public schools in America have been bent on ignoring the obvious: Almost nothing about the way we hire, evaluate, pay, or assign teachers to classrooms is designed with that difference in mind. Most teachers receive only cursory performance evaluations, with virtually every teacher graded highly. We use a one-size-fits-all salary structure, in which the only factors used in raises are teachers’ higher-education credentials and number of years in the system, neither of which is strongly linked to their effectiveness. And we often let seniority, rather than merit, drive decisions about where a teacher is placed. It is in many ways an industrial model that treats teachers as identical, interchangeable parts, when we know that they are not.

Now a new wave of research is challenging this status quo by showing that one can actually measure the difference a teacher makes. The studies use a statistical analysis of standardized test results to measure the “value added” that each teacher contributes each year, revealing stark differences in their ability to move a class forward. According to one recent value-added study of Los Angeles schools conducted by Harvard economist Tom Kane, having a good teacher for a single year translates to a 10-point-higher score on student achievement tests that use a standard 100-point scale. “That’s a big difference,” says Kane.

These value-added studies are fueling a high-stakes debate over teacher policies, with some proposing using the technique to tie teacher pay to what is happening in each classroom. That argument is now being made at the highest levels in Washington, where there is growing interest in making measures of teacher effectiveness a central part of school reform.

“Teacher evaluation in this country is fundamentally broken,” says Arne Duncan, President Obama’s education secretary, in an interview in Boston. Duncan says value-added studies should never be the sole basis for evaluating teachers, “but to act like teaching doesn’t impact student achievement, I think, is an absolute slap in the face of the profession.”

Duncan is urging that aggressive efforts to improve teacher effectiveness be a major part of the reauthorization of No Child Left Behind, the school reform law that Congress may take up in the coming year. In the meantime, he is overseeing a $4.3 billion fund, dubbed the Race to the Top program, which will make competitive grants to states pursuing innovative school reform strategies in four big areas, one of which is developing, rewarding, and retaining effective teachers.

Last week, Massachusetts state education officials unveiled a value-added data system that can track the growth in individual students’ achievement from year to year. State officials have not laid out any formal plans to use it to assess teacher effectiveness, but it seems clear that there is interest in moving in that direction, and the state plans to cite the new database in its application for Race to the Top funds.



In 1966, the federal government released a seminal report titled “Equality of Educational Opportunity.” Written by James Coleman, a prominent sociologist, the report attempted to untangle the various influences on student performance in American schools. The study, widely known simply as the Coleman Report, concluded that “only a small part of [student achievement] is the result of school factors, in contrast to family background differences between communities.”

Since then, study after study has shown the strong connection between forces outside schools - parenting, family stability, socioeconomic background - and achievement levels. Eric Hanushek was a Harvard graduate student in economics when he was selected, in 1966, to join a yearlong seminar to help formulate policy recommendations based on the Coleman Report. Though Hanushek agreed with the overarching finding that family background has an important effect on student achievement, he says the idea that schools and teachers were not an important variable struck him as off-base.

“I thought that sounded kind of crazy, and it launched me into all this work,” says Hanushek, now a Stanford University researcher who has spent the ensuing four decades studying American schools.

Starting in the 1970s, Hanushek became one of the first researchers to try to quantify the impact of teachers on student learning. Since student achievement tends to rise along with family income and other nonschool factors, the challenge was to try to isolate the actual effect of teachers on learning. Hanushek and a North Carolina statistician named William Sanders were early developers of the value-added model. By looking at how much a student has progressed in a year, regardless of where he or she started from, the model claims to capture the true effect of a given teacher. “You take most, if not all, the socioeconomic issues off the table,” says Sanders.

In 1992, Hanushek conducted one such study in schools in Gary, Ind. He ranked teachers based on the average growth in achievement shown by students in their classes, and then compared a school year’s progress for students of the highest-ranked teachers with that of students of the lowest-ranked teachers. The difference amounted to a full year of learning: students of the lowest-performing teachers gained half a year (compared with average achievement growth), while students of the highly effective teachers gained one-and-a-half years.
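
For readers who want to see the arithmetic, the ranking step described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the method Hanushek or any other researcher actually used: the teacher names, the scores, and the function names are all invented, and real value-added models apply statistical controls that this sketch leaves out.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (teacher, fall score, spring score) for each student.
# All names and numbers are invented for illustration.
records = [
    ("Teacher A", 40, 58),
    ("Teacher A", 70, 86),
    ("Teacher B", 45, 50),
    ("Teacher B", 80, 87),
    ("Teacher C", 55, 66),
    ("Teacher C", 60, 73),
]


def average_gain_by_teacher(rows):
    """Return each teacher's mean score gain (spring minus fall)."""
    gains = defaultdict(list)
    for teacher, before, after in rows:
        # Growth is measured from each student's own starting point.
        gains[teacher].append(after - before)
    return {teacher: mean(values) for teacher, values in gains.items()}


def rank_teachers(avg_gains):
    """Sort teachers from largest to smallest average gain."""
    return sorted(avg_gains, key=avg_gains.get, reverse=True)


avg_gains = average_gain_by_teacher(records)
ranking = rank_teachers(avg_gains)

# Gap in average growth between the top-ranked and bottom-ranked teacher.
gap = avg_gains[ranking[0]] - avg_gains[ranking[-1]]

print(ranking)
print(gap)
```

Because each student is compared only with his or her own earlier score, a teacher whose students start far behind can still rank at the top if those students gain the most over the year.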

Over the last several years, a flurry of value-added studies has been carried out. In a 2004 study of two New Jersey school districts, Columbia University economist Jonah Rockoff found that student achievement test results in reading and math were about 10 points higher, on a 100-point scale, for those in the classrooms of top teachers. In a 2007 study of Chicago schools, led by Daniel Aaronson of the Federal Reserve Bank of Chicago, students assigned to a top-rated teacher made about 20 percent greater gains in math over a school year.

“What these data are telling us is that the solution to the problem is right in front of us, if we could just get much more serious about identifying and rewarding effective teaching,” says Harvard’s Kane.

Critics of value-added assessments contend that there is far too much room for these studies to miss factors that might account for apparent teacher-effectiveness differences but actually have nothing to do with the teacher. Jesse Rothstein, a University of California-Berkeley economist, wrote a paper earlier this year suggesting that principals do not randomly assign students of varying abilities to classrooms, a practice he says could skew results. Others have pointed to the unreliability of value-added assessments of teachers that are based on only a couple of years of data or small numbers of students.



While nearly everyone agrees that there are clear differences in teacher effectiveness, you would never know that by looking at the evaluation systems used in most public school districts. Formal review of tenured teachers is typically done every two or three years, but evaluations often provide no critical feedback: In an example of grade inflation that would make even the most generous teacher blush, virtually all teachers are routinely awarded high marks.

“In many of our districts across the country, we are still operating in the old mindset where you’re either excellent or you’re nothing, so it puts a lot of pressure on principals and headmasters to rate people higher,” says Carol Johnson, the superintendent of Boston schools.

In June, the New Teacher Project, a New York-based nonprofit, released a report that looked at teacher evaluation data from 12 school systems around the country. In districts that use a so-called binary evaluation system with just two categories (usually “satisfactory” and “unsatisfactory,” or some variant of those), more than 99 percent of teachers were judged satisfactory over the four-year period from 2003 to 2006. In districts with more categories, 94 percent of teachers were in one of the top two rating categories, while less than 1 percent were rated unsatisfactory.

In Boston, which uses a satisfactory/unsatisfactory system, 97 percent of all evaluated teachers received a satisfactory designation from 2003 through 2008. At 72 of the district’s 135 schools, not a single teacher was given an unsatisfactory evaluation. Fifteen of these are on the state’s list of chronically underperforming schools.

“This phenomenon where you get schools where year after year the kids are failing, and the teachers are all deemed to be great - that’s not a recipe for improving learning for kids in poor neighborhoods,” says Dan Weisberg, the policy director at the New Teacher Project.

Among the reasons for the uniformly high evaluations in most districts, say those who study the issue, is that principals often have little training in how to review teacher performance and they allot minimal time for evaluations in already overcrowded work days. What’s more, there is often so little riding on the outcome that negative evaluations may only sow ill will among a school’s teaching staff, while doing little to improve teacher performance.

Identifying substandard performance is not the same as doing something about it. That would require evaluations to be part of a more robust system than exists in most districts, with the potential for excellent teachers to gain special recognition - and perhaps added pay - as well as giving meaningful professional development to those with shortcomings and dismissing those who do not improve enough.

“I don’t know if it’s that the system is broken or [that it’s] the people at the head of the system,” says Anne Wass, president of the Massachusetts Teachers Association, the state’s largest teachers union. “There’s tremendous need for principal education,” she says of the evaluation process. “There are many who don’t want to make people feel bad. They kind of look the other way.”

Although Wass agrees that teacher evaluations are too perfunctory, she and other Massachusetts union leaders do not see incorporating student achievement data into them as the answer. Teacher effectiveness “isn’t something that’s easily quantified,” says Wass. “It is somebody who is able to relate to and connect with their students and the parents of their students, and is able to educate them to the fullest potential. One thing it is not is whether your kids get the highest test scores.”

No one is advocating that student test scores be used as the sole basis for teacher evaluation. But reform advocates say firm measures of student learning must be part of what teachers are judged on, along with things like direct classroom observations that assess performance against standards regarded as effective teaching practices.

Value-added assessments couldn’t serve as the only basis for teacher evaluations even if there were agreement to do so. Only about one-quarter of US public school instructors teach subjects or grade levels in which students take the standardized tests used to make value-added assessments. Harvard’s Kane is now conducting research to see how consistently classroom-based observations and other types of evaluations line up with the results of value-added assessments of a teacher’s effectiveness. The more they do correspond, he says, the greater the confidence we can have that teachers who rate highly in these nonquantitative assessments are also succeeding in promoting growth in student achievement.

Mitchell Chester, the state education commissioner, says the value-added model Massachusetts is developing will allow the state to get beyond the discussions that explain away poor performance among poor kids because of their disadvantaged background. “It takes away the excuse-making,” he says. “It takes into account where the child started from. It’s not asking the teacher whether the child was able to jump over the high bar the first time out. It’s asking, ‘Based on where the child was able to jump previously, have you helped the child vault higher?’ We cannot shy away from using evidence of student learning as part of our evaluation and feedback mechanism.”

Last year, the Bill & Melinda Gates Foundation announced that it would commit $500 million over five years to identify effective teachers and increase their numbers in schools. The foundation has brought on Kane to oversee the research, which will include efforts to identify different forms of teacher evaluation that correlate highly with quantitative assessments linked to student achievement.

As critics of value-added studies have pointed out, there are reasons to exercise some caution in developing teacher policies that take account of student achievement. But there is a bigger price to pay for failing to act and muddling on with a system that doesn’t distinguish, reward, or build on excellence. Shedding those practices, however, will involve nothing less than a cultural revolution.

“You’re trying to make performance matter, and it’s never mattered before,” says Kati Haycock, president of the Education Trust, a Washington, DC, policy organization. “If you believe it has to matter, like it does in practically every field, that’s a huge change in the tools that people have to have, and it’s a huge change in culture.”

Michael Jonas is executive editor of CommonWealth magazine, published by MassINC, a nonpartisan Boston think tank. A longer version of this story appears in the magazine’s fall issue, available at www.massinc.org.

(James Steinberg for The Boston Globe)