9 April 2001
Our current howler: Why Kate can't read
Synopsis: Who flunked the recent NAEP fourth-grade test? Let's start with the New York Times.
Gap Between Best and Worst Widens on U.S. Reading Test
Kate Zernike, The New York Times, 4/7/01
Occasionally, angry readers write to ask us when we're starting the new web site, the site devoted to urban ed issues, which we discussed a few months back. Alas! We're working hard on a deathless book about the coverage of the 2000 campaign (The Spinning of the President, Year 2000); and we postponed the start of our incomparable new site to review press coverage of the Bush budget plan. But a front-page article in the New York Times made us wish that the site were active. Saturday's article reviewed Y2K NAEP test results. Let's take a look at what it said.
"Gap Between Best and Worst Widens on U.S. Reading Test." So read the article's page-one headline. Kate Zernike's opening paragraph:
ZERNIKE (1): Results of nationwide fourth-grade reading tests released yesterday show a widening gap between the very best students and the very worst despite a decadelong emphasis on lifting the achievement of all students.
"From 1992 to 2000," Zernike wrote, "the average reading scores for fourth graders on the National Assessment of Educational Progress, known as the nations report card, remained flat. The average score for top students increased while the average score for bottom students declined even more significantly."
Even more significantly! It sounded bad. "The release of the scores led to a round of finger-pointing over the cause of the growing gap," Zernike claimed. In paragraph four, she proved it:
ZERNIKE (4): Kati Haycock, director of the Education Trust, a nonprofit group that advocates for disadvantaged students, said the numbers spoke of "a frightening sort of educational Darwinism," adding, "It would appear that in a deeply misguided response to demands for higher achievement, schools are focusing their efforts and resources on those students most likely to succeed while neglecting the students who most need help."
A frightening sort of educational Darwinism! In paragraph five, Zernike noted other complaints. "Others said the problem was that teachers had failed to learn the best ways to teach reading," she said. (In paragraphs 8 and 9, she cited more claims. For full text, see postscript.)
Let's review. According to Zernike, there was a widening gap between the best and the worst. Average scores for top students were increasing, while average scores for bottom students declined "even more significantly." This had led to a round of finger-pointing, she said. She quickly gave one excited example, and never cited anyone saying that there may not be a "problem" at all.
Is there anything in the NAEP results to justify this sort of coverage? Alas! A closer look at Zernike's work suggests where the biggest incompetence may lie: among some of the nation's excitable adults, not among the nation's schoolkids.
LET'S TAKE A LOOK AT THE ACTUAL SCORES THAT PRODUCED this coverage in the Times. How did the fourth-graders really do? When we get inside the actual figures, the changes in scores are remarkably small. For example, in paragraph 12, Zernike notes an intriguing fact: "the average score [of all tested students] in 2000 was 217, the same as in 1992," she writes. As it turns out, the scores recorded by top and low scorers hadn't changed very much, either.
How did the top scorers do? Zernike provides the scores achieved by students at the 90th percentile. In 1992, she says, students at the 90th percentile scored 261 on the NAEP's 500-point scale. In 2000, the score was 264. She quotes a NAEP spokesman saying that the change in scores is "statistically significant" (no elaboration on that claim is offered), but one can't help being skeptical, given the tiny score change. Indeed, all the way down through the scores, the changes are very slight. According to a chart which Zernike provides, scores also went up by 3 points at the 75th percentile (from 242 to 245). At the 50th percentile, scores went from 219 to 221. At the 25th percentile, scores dropped by one point; kids at that level scored 194 in 1992, 193 in the year 2000. Surely, even alarmists aren't going to say that this one-point drop on a 500-point scale is significant. And remember: these scores are derived from a national sample of 8000 kids, from whom Zernike attempts to draw conclusions about the much larger national student population. But there is no such thing as a "perfect" sample; the slightest change in the composition of the 8000-student group could account for a tiny, one-point change. It is almost impossible, and frankly absurd, to draw sweeping conclusions from score changes as slight as these.
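The arithmetic here is easy to check for yourself. The sketch below tabulates the per-percentile changes from the article's chart, then makes a back-of-envelope estimate of the sampling noise in an 8000-student sample. The standard-deviation figure is our own assumption for illustration, not a NAEP number:

```python
import math

# Percentile scores as reported in the article's chart.
scores_1992 = {90: 261, 75: 242, 50: 219, 25: 194, 10: 170}
scores_2000 = {90: 264, 75: 245, 50: 221, 25: 193, 10: 163}

# Tabulate the change at each percentile.
for pct in sorted(scores_1992, reverse=True):
    delta = scores_2000[pct] - scores_1992[pct]
    print(f"{pct}th percentile: {scores_1992[pct]} -> {scores_2000[pct]} ({delta:+d})")

# Rough sampling error: with n = 8000 and an ASSUMED score spread
# (sigma ~ 40 points is our guess, not a NAEP figure), the standard
# error of a sample mean is sigma / sqrt(n).
n, sigma = 8000, 40
se_mean = sigma / math.sqrt(n)
print(f"Approx. standard error of the sample mean: {se_mean:.2f} points")
```

Standard errors for percentile scores run larger than the standard error of the mean, and NAEP's actual sample design is more complex than a simple random draw, so this is only a rough sense of the noise; but it shows why one- and two-point swings deserve skepticism.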
Zernike's overview does reflect one reality. The largest score change comes near the bottom of the scale; at the 10th percentile, the average score dropped from 170 to 163. Based on that score change, it may well be that the nation's lowest-achieving students were somewhat more capable in 1992 than they were in the year 2000. (It's possible; it's surely not obvious.) But does that reflect some change in the way the schools have performed? Zernike is quick to quote one loud alarmist crying out about "educational Darwinism"; Haycock, the person Zernike quotes, simply assumes that this change in scores reflects a change in the work of the schools. But that conclusion is far from obvious. Has the student population changed in significant ways? If more kids now speak English as a second language, for example, it would hardly be surprising if scores dropped near the bottom of the scale. This would not necessarily reflect any change in how the schools were performing. Are more kids coming from troubled homes, however defined? Again, a change in scores could result from this, not from the conduct of schools.

Two things, then, can be said about this change in the score at the 10th percentile. It's hard to know if this change means anything at all; that is, it's hard to know if the entire population would have scored like this, if all the nation's students had been tested. And if the entire population did score like this, it would be hard to know if that change had resulted from a failure on the part of our schools, or from some change in the student population. But Zernike fails to note the problems inherent in sampling, and rushes to quote a loud "finger-pointer," who asserts that the schools have done something that is horribly wrong. There is no evidence offered, none at all, to back up Haycock's loudmouth assertion.
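The population-change point can be shown with a toy simulation. In the sketch below, every number is invented; each subgroup's score distribution is held fixed across the two "years," and only the mix of students changes. The 10th-percentile score falls anyway, with no change at all in how any group (or any school) performed:

```python
import random

# Toy illustration (ALL numbers invented): the 10th-percentile score can
# drop even if NO group's performance changes, simply because the
# population mix shifts between testings.

def percentile(data, p):
    """Score at the p-th percentile (nearest-rank; good enough here)."""
    data = sorted(data)
    return data[int(p / 100 * (len(data) - 1))]

def cohort(n_majority, n_second_lang):
    # Each group's score distribution is held FIXED across both cohorts.
    majority = [random.gauss(220, 35) for _ in range(n_majority)]
    second_lang = [random.gauss(185, 35) for _ in range(n_second_lang)]
    return majority + second_lang

random.seed(1992)
p10_then = percentile(cohort(7600, 400), 10)    # 5% second-language share
random.seed(2000)
p10_now = percentile(cohort(6400, 1600), 10)    # 20% second-language share

print(f"10th percentile, 5% second-language share:  {p10_then:.0f}")
print(f"10th percentile, 20% second-language share: {p10_now:.0f}")
```

Nothing here says this is what actually happened between 1992 and 2000; it simply shows why a drop at the bottom of the scale, by itself, cannot establish that the schools changed.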
But there it is, in paragraph 4, with no countervailing outlook provided, and with no one ever noting that it is very difficult to draw conclusions from data like these.
FINAL POINT: ANYONE WHO FOLLOWS EDUCATION REPORTING will notice one thing: our press corps simply loves to report how dumb our schoolchildren are. But the comedy almost always comes from the incompetence of the reporters themselves. Zernike pens an excited tract based on exceptionally limited data. And how carefully has she reviewed those data? Consider this error-strewn passage:
ZERNIKE (12): The Department of Education reports the scores on a scale of 0 to 500 and by achievement levels: below basic, basic, proficient or advanced. The average score in 2000 was 217, the same as in 1992. The average scores of students in the bottom level dropped 7 points, to 163 from 170, and the scores in the top level rose to 264 from 261. In both cases the changes, while small, were statistically significant, said Gary W. Phillips, the acting commissioner for the department's National Center for Education Statistics.
(13) The percentage of students scoring at the advanced level increased to 8 percent from 6 percent between 1992 and 2000, and the percentage above proficient rose to 32 percent from 29 percent. The percentage below basic, 37 percent, barely changed.
In paragraph 12, Zernike says that the average score of the "advanced" group was 264 in last year's testing. But the chart included with her article shows that this statement is simply wrong: according to the chart, 264 was the score attained by the 90th-percentile child, who was not even part of the eight percent scoring at the "advanced" level (see paragraph 13). Similarly, Zernike says that 163 was the average score of students in the "below basic" group. That's wrong too: 163 was the score recorded by the kid at the 10th percentile (37 percent of the kids scored "below basic"). These errors play a minor role in the overall story, but they show the lack of attention and skill which the Times has brought to this story. Indeed, all over the press corps, reporters and editors who can't describe the simplest facts offer sweeping judgments about the schools. In this article, the Times fails to report the simplest facts about the scores attained on this test. But there is the Times, out on page one, quoting a set of loudmouth spinners, who make a set of wholly unsupported claims about what these test scores surely must show.
The world isn't going to come to an end because of Zernike's article. But routinely, the ed press makes much more significant errors in reporting the state of our schools. Who pays the price for this incompetence? The burden falls where the help is most needed: on children attending our city schools. Those children are part of an educational disaster; they are owed a careful review of their plight. But Zernike can't even describe simple test scores. What are the odds that she'll ever figure out what's up in our troubled city schools?
More excitement: Everyone Zernike quoted or cited was in a tizzy about the new test scores:
ZERNIKE (8): Federal education officials called the scores disturbing and a sign that education colleges were not imparting the latest ways to teach reading. National reports show plenty of evidence about the best methods, they said, but in the field, educators are still warring between whole language and phonics, and the proven methods are not filtering down to those who need them most. The best method, several researchers and national panels have said, is neither pure whole language nor pure phonics but more of a hybrid, which would emphasize teaching children to decode the meaning of words.
(9) "Although we talk about reform, not all the classrooms of America are seeing this reform," said Marilyn Whirry, a teacher in California and a member of the National Assessment Governing Board, which oversees the test.
Everyone Zernike cited assumed that the data showed a real decline among low achievers. And everyone assumed that this real decline was somehow being caused by the schools. No one said that the test score changes were minor. And no one said that the decline in performance, even if real, may not have been caused by the schools.
The occasional update (4/9/01)
Third way: Meanwhile, Andrew Toppo of the Associated Press came up with a third explanation. According to Toppo, that "163" wasn't the average score of the "below basic" group. And it wasn't the score at the 10th percentile, either. Toppo penned a third account. Here's his work from the Chicago Tribune:
TOPPO: But while students in the top 10 percent increased their average score a bit, from 261 to 264, the average scores of readers in the bottom 10 percent dropped from 170 to 163.
According to Toppo, that 163 was the average score of kids in the bottom 10 percent. We are assuming that that's incorrect, but Andrea Billups of the Washington Times reported it that way too:
BILLUPS: Worse still, the gap between the nation's best readers and its most struggling readers continues to widen, with students who scored in the top 10 percent of the exams increasing their average score from 261 to 264, while those in the bottom 10 percent fell from 170 to 163.
So, if you were reading about the NAEP test on Saturday, you had your choice about that 163. It was one of these:
- The average score of the "below basic" students (the bottom 37 percent)
- The average score of the bottom ten percent
- The score of the kid at the 10th percentile
Everyone agreed: there was a "163" in the NAEP report. They just couldn't figure out what it meant.
Our question: As we continue to study why Johnny can't read, shouldn't we study the press corps too?