Test Mania A Mere Symptom of Bipolar Policy Disorder

By Charles Barone, DFER Director of Policy

My email inbox started blowing up yesterday over a New York Times editorial entitled, “The Trouble with Testing Mania.” A lot of people are concerned that the editorial is anti-testing, and the headline certainly lends itself to that reaction.

A close reading, however, reveals that it’s not that simple. In a society where polarization has crippled our ability to get important things done, it helps if one resists the temptation to glean which education camp the author has chosen to favor and focuses instead on the actual content. (We’ll address the politics of “The Trouble with Testing Mania” later in the week.)

I’m very pro-testing, and while I could quibble with a sentence here and a sentence there, overall I think The New York Times’ editors got it right. (Which is not to say they got it perfect.) Here are three key reasons why:

1. The “trouble with testing mania” referred to in the headline centers mostly on test prep, not testing per se.

“[It] has become clear to us over time that testing was being overemphasized — and misused — in schools that were substituting test preparation for instruction.” [emphasis added]

I’d be more comfortable attributing the word “mania” to the use of test prep (rather than to those freaked out by testing) if we had good data on how widespread it is. There’s anecdotal evidence for it, but there’s also anecdotal evidence that teachers and students spend a lot of time engaged in other non-constructive activities. (Try googling, for example, “teacher shows R-rated movie,” “teacher student text message,” “teacher asleep in class,” or “teacher student underage sex.”)

Still, there’s no doubt that test prep is happening, and that it likely isn’t good for kids on the other end of it. That test prep is used at all speaks to the editorial board’s broader points on teacher preparation and training. Research suggests, counterintuitively, that test prep actually isn’t effective in raising test scores.

For example, in one study, teachers who focused on the full body of skills needed to master a subject had student score gains on the Iowa Test of Basic Skills (ITBS) that exceeded the national average by 20 percent. The students of those teachers who used a narrow, drill-and-kill approach had ITBS gains that were much lower than the national average. Moreover, the type of instruction offered was found to be a function of “teacher disposition and choices” rather than the particular characteristics or achievement levels of the students being taught.

Upshot: some teachers teach to the test, and some don’t. Those who do (and we have to stress that this includes those who don’t choose to but are forced to by principals or superintendents) may be unwittingly hurting their students.

2. The NYT Board is for the use of testing in theory but rightly stresses that such tests should be valid and reliable, e.g., “Test scores should figure in [teacher] evaluations, but the measures have to be fair, properly calibrated and statistically valid.” Few of us who believe in the importance of student testing would disagree with the assertions that many tests created in the past 10 years “were weak, and did not gauge the skills students needed to succeed” and that “most states did not invest in rigorous, high-quality exams with open-ended essay questions that test reasoning skill.” However, that does not mean current tests have no validity. While the piece does not quite say that, some clarification would have rounded out the picture.

A recent study that earned lead author Raj Chetty of Harvard a MacArthur Fellowship found that a student assigned to a teacher deemed effective based on test gains in her prior classrooms is more likely to show short-term increases in achievement than a student assigned to a less effective teacher. And, as the authors state, “the gains don’t stop there: the students who learn from that teacher are more likely to attend college, earn more, and are less likely to have children as teenagers.” Upshot: variance in student test scores can be partially attributed to individual teachers and those teachers are also associated with longer-term student outcomes.

Unlike most other education data (e.g., the GPAs of students in teacher training programs, our current “Widget Effect” teacher evaluation systems, or the rate at which teachers attain tenure), which are highly skewed toward creating the impression that “everything is fine,” standardized achievement tests show some differentiation that can be used to make policy decisions. Our current tests are imperfect, just like every assessment ever created. But as noted in both the New York Times editorial and the comments above, their deployment was not and still is not without merit.

3. Which brings us to the part of yesterday’s editorial that many people who are gleeful over a perceived NYT anti-testing editorial are likely to ignore. In comparing high-achieving countries that use testing less with the U.S., the board wrote: “Perhaps most important, they set a high bar for entry into the teaching profession and make sure that the institutions that train teachers do it exceedingly well.”

The fact of the matter is that, unlike those countries, the U.S. has not only a much weaker system of teacher recruitment and selection, but a poor track record across the board when it comes to differentiating between effective and ineffective teachers and between successful and failing education systems. This was what gave rise to testing-driven reforms in the first place.

There are two separate broader points that bear teasing out: 1) Better teacher preparation could lead to teachers responding more constructively and wisely to accountability systems, whatever their components; 2) More selectivity of teacher candidates and more rigorous education and training by teacher preparation programs could provide the biggest bang for the ed reform buck.

Improving teacher preparation would certainly help teachers and administrators choose more wisely in the practices they employ to boost student achievement. It would also take a lot of the pressure off current accountability systems to drive change, because a greater number of teachers would come into the profession with an attitude of professionalism and an appreciation of the need for rigor. Here, an ounce of prevention truly may be worth a pound of cure.

If, as promised by those spearheading such efforts, the next generation of tests is better than the last, then both instruction and assessment will address a broader range of skills and will be aligned much more closely with pre-service and in-service training of teachers. This doesn’t mean teacher preparation reform has to wait. In fact, the sooner we get game-changing reforms in the way we prepare teachers, the less the testing debate over the long term is going to matter.

Charles Barone has more than 25 years of experience in education service, research, policy, and advocacy. Prior to joining Democrats for Education Reform (DFER) full-time in January of 2009, Barone worked for five years as an independent consultant on education policy and advocacy. His clients, in addition to DFER, included the Citizens’ Commission on Civil Rights, the Education Trust, The Education Sector, and the National Academy of Sciences.