Putting the Test to the Test

(This originally appeared in 2003, on the 10-year anniversary of the establishment of New York State’s mandatory standardized testing regimen, now more commonly known as “the Regents Exams.”)

After almost 10 years of controversy, New York State education officials insist that their new standardized tests have improved classroom teaching and raised the achievement levels of high school graduates.

“Prior to the early 1990s there were no such thing as education standards in New York,” said Deputy Education Commissioner Jim Kadamus. “Under the old system, we had a two-track system: some kids got the Regents, some kids didn’t.”

Yet critics of the state’s program say it may actually be doing more harm than good. And Kadamus and others cite virtually no hard statistical evidence behind most of their claims.

Milton Cofield, a member of the New York State Board of Regents, was asked at a December public meeting in Rochester to name even a single study showing that the Regents exams have been accomplishing everything the state claims.

He couldn’t.

Kadamus himself acknowledged in February that the state actually has no statistical information showing that the Regents exams have worked. Or that they should work.

“The evidence we have right now is anecdotal,” he said.

Despite that lack of evidence, Kadamus and the Board of Regents say the new exams, five mandatory Regents tests that determine whether a student can graduate, are doing everything they are supposed to.

“The Regents adopted the Regents exams as the most reliable and consistent way, on a statewide basis, to measure the standards, and then they set up an assessment review panel with people from around the state to review alternatives,” said Alan Ray, the state Education Department’s director of communication.

Ray added that the exams are clear improvements over the former student assessment standards, which allowed many students to slip through the school system without a basic mastery of skills.

“The English exams added a lot more writing, the math exams added more multi-step word problems, the social studies exams had more document interpretation problems where you had to do more research,” he said.

A close examination of the history of these tests, though, shows that the state’s own panel of experts recommended against exactly the kinds of exams the state enacted. Another panel of experts told the state it had to prove that the exams work before it could institute them – which never happened.

A review of scientific research on the subject shows that there is virtually no evidence that standardized tests like the Regents exams are good for education – and a rather large body of scholarly work suggests they aren’t.

The research is significant enough that the American Psychological Association, the American Educational Research Association and the National Council on Measurement in Education all determined as far back as 1985 that no important decision on the fate of a child should ever be made on the basis of a single test.

The state was told that but did it anyway.

Something Had to Change

The state says it needed to institute the new, mandatory Regents exams because New York had a “two-tiered” educational system that shut out the poor, minorities and people with the bad luck to be in a bad school system.

“Somebody made a decision about that. Somebody said ‘Benjamin’s smart and gets to take the Regents, Jim’s not so smart,’” Kadamus said. “And many of the people being put on the lower tier were minority kids, urban kids, poor kids from rural areas who people thought ‘it just isn’t worth it.’ When we forced everybody to take the Regents, many of the kids who weren’t even getting a shot before got their shot, and many succeeded.”

The biggest benefits, he said, are among those who suffered the worst under the previous system.

“In the suburban schools, you’re probably not noticing much difference,” he said. “Most suburban schools were getting 70, 80 percent of the kids passing the Regents anyway.

“For Pittsford and Brighton it’s really not a big change,” he said. “But where this has to have an impact is in places like Rochester and Buffalo and New York City and Utica and Jamestown.”

Combined with the exams given at the fourth- and eighth-grade levels, designed to make sure students enter high school on track, Kadamus said the Regents exams will eventually raise the bar for all students in New York.

“The minimum level of competency has gone up, and the whole system has shifted up,” he said. “We’ve established that many kids can pass the Regents exam. We haven’t established that all kids can. We’re still working on that.”

And, even though Kadamus could not cite any specific study demonstrating that the Regents exams were having a beneficial impact, he did say his understanding is that surveys out of the State University of New York system show improved grade point averages as a result of the Regents exams.

What the Research Says

There are a few studies – even one as recent as 2003 – suggesting that a program like the Regents exams can have a good impact on students, though the state did not cite any.

One study, completed last year by Martin Carnoy and Susanna Loeb, showed that some states that enact standardized testing have higher average scores on other standardized tests like the SATs.

But the overwhelming body of research on standardized testing either strongly indicates or states directly that tests like the Regents exams are bad for classrooms and students, especially the poor and minority students the state was most trying to help.

The most recent studies were released this month, and the accumulated research is considered definitive enough that when the American Psychological Association, the American Educational Research Association and the National Council on Measurement in Education revised their testing guidelines in 1999, they retained and updated the warning that no important decision about a child should be based on the result of a single test.

“It’s just good practice across the board to make important decisions based on more than one piece of information, to try to get as much information, and different kinds of information, as possible,” said Marianne Ernesto, the American Psychological Association’s director of testing and assessment. “It’s ingrained in everything we know about assessment.”

Assemblyman Steven Sanders, D-NYC, who chairs the state Assembly’s committee on education, said the evidence presented to his committee on that point is overwhelming.

“Even the companies that make the tests, like McGraw Hill, will tell you that,” Sanders said. “It’s like the warning on the side of a pack of cigarettes: they warn you that a test should not be taken in and of itself to determine the worth of a student.”

Linda Darling Hammond, an educational specialist at Stanford University who chaired the state’s council on curriculum and assessment in the early 1990s, likewise said that there’s little ambiguity on this point.

“I don’t know that there’s total unanimity,” she said, “but there’s a strong body of research that’s well known by a lot of scholars.”

In fact, prior to implementing the standardized tests in New York, the Regents were specifically warned by their own panel of experts not to put a system like the current Regents exams in place, according to Darling Hammond, who chaired that panel.

What New York State Was Told

The Regents created the new state educational standards, called “A New Compact for Learning,” in 1991. The compact called for a number of standards to be met in areas ranging from science, math and technology to career development and the arts. The question was how to test whether schools and students were meeting those goals.

The state appointed the New York State Council for Curriculum and Assessment that same year. Made up of leading educational policy experts, the council was charged with deciding the best way to gauge whether the standards were being met.

The SUNY system weighed in immediately with a 1992 report that specifically asked the Regents to place a greater emphasis on portfolios and performance-based assessments including research projects, laboratory experiments, essays and exhibitions, “rather than short-duration standardized paper-and-pencil tests.” SUNY said the tests provided no useful information about a student’s real capabilities.

The recommendations of the council, published in a 1994 report, are a lot like what SUNY wanted, and almost nothing like the system eventually put in place.

Instead of standardized exams, the council determined that students should be required to put together a “Regent’s Portfolio” that would include special projects, research papers, writing samples and other evidence indicating mastery of skills, in addition to evaluations of teachers and some standardized tests. No single area could have ensured graduation, or kept a student from graduating. The portfolio would be considered as a whole.

“Every district would have created pieces that would have fit the standards, such as special science projects,” said Darling Hammond.

Council member Deborah Meier, who founded Central Park Elementary School in New York City and wrote the book “In Schools We Trust,” called it “a very balanced idea between responsibility of state and local control. It was a real leap in accountability that was tied to good data.”

The Regents endorsed the plan in 1994.

But then Richard Mills, appointed commissioner of education in 1995, reversed that decision. Instead, Mills opted to go in a new direction.

What New York State Did

Prior to coming to New York, Mills headed the education system in Vermont and had instituted a portfolio system for that state, which experts said was applied unevenly from school district to school district.

The results of Mills’ decision for New York are well known: every public high school student now has to pass five standardized tests in order to graduate. No exceptions.

Mills’ office in Albany deferred comment for this article to communication director Alan Ray.

“The long and short of why the decision was made not to use (portfolios),” Ray said, is that portfolios vary too much from school district to school district. “They cannot work on a statewide basis to evaluate student performance in a consistent way from, say, Long Island to Buffalo.”

Meier, asked if the curriculum and assessment council supported that conclusion or the testing system that the state then came up with to replace portfolios, said she was shocked.

“I think the state has come up with something so shabby as to be scandalous,” she said. “To use one very narrow form of standardized testing is bound to distort schooling. It’s like assuming that the only thing a corporate leadership should be interested in is the bottom line this year. That’s not the only thing they should be interested in. If they are, you get Enron. Good management wants to know something about the long-term goals, what the numbers mean.

“The more you put your focus on this year’s bottom line,” she said, “the more you distort, both in the corporate world and in the education world.”

Asked if an increased drop-out rate and a narrowing of school curricula to fit the tests were predictable, Darling Hammond pointed out that the Regents exams violate the most basic tenet of testing, the American Psychological Association’s guideline that no single test should determine an important decision for a child.

“When this policy direction was adopted, there was already some indication that the use of high-stakes testing, particularly if it doesn’t allow for a range of measures of performance, would result in high failure rates for students and the potential for increased drop-out and push-out rates,” she said. “That had been documented already in states that had adopted systems like that.”

The Impact

It’s been documented in New York, too. According to Walter Haney, a senior researcher at the Center for the Study of Testing, Evaluation and Educational Policy at Boston College, New York’s graduation rate has dropped from 61 percent in 1997-1998, when the tests began being implemented, to 57.6 percent in 2001-2002, the last year for which statistics are available.

That’s one of the five worst statewide graduation rates in the United States and, Haney said, represents 250,000 students who dropped out as a direct result of the state’s policy.

The state fervently disputes Haney’s data.

“The graduation numbers, if you look at the last 10 years, have stayed the same and, in fact, have gone up a little bit,” Kadamus said. “We have the same or more graduates and we know now that you’ve got to complete five Regents tests.”

But other independent surveys compiled on the state’s drop-out rates also show declining graduation rates.

John Warren, in a 2003 paper presented before the American Sociological Association, found that the state’s graduation rate declined by 3 percent from 1995 to 2000 (the last year for which he calculated), a result almost identical to Haney’s.

Jay Greene and Greg Forster, analysts at the Manhattan Institute, used yet another method of calculating graduation rates. They found that as of 2001, New York had the ninth-worst graduation rate among all 50 states.

“Official graduation rates going back many years have been highly misleading in New York City, Dallas, the state of California, the state of Washington, several Ohio school districts, and many other jurisdictions,” their report said.

Are the Tests Any Good?

To help deal with the technical issues that crop up when designing a major new testing program, the state put together a second committee, also composed of nationally known testing and education experts, called the Technical Advisory Group (TAG).

That group, too, had its recommendations ignored – particularly on the critical issue of determining whether the tests actually measure anything at all.

In testing terms, that’s called “validity.” If a test is “valid,” then an increase or decrease in scores actually measures something, like how much a student has learned. If a test is “invalid,” then all a changing test score means is that a number has changed – it doesn’t actually refer to anything in the real world.

It’s not enough that a test be designed around a curriculum; it also has to be valid.

“Do rises on (standardized) test scores indicate, for example, rises on other measures?” asked Richard Ryan, a psychologist specializing in testing issues at the University of Rochester. “Generally no.”

Ray said the state has data showing the tests are valid.

“We’ve done a whole bunch of validity studies,” he said. “We’ve published 70 or so studies.”

He was unable, though, to name a publication the studies appeared in.

After six weeks of requests by Messenger Post Newspapers for those studies, Ray provided five.

Those studies were sent to testing experts, including Ryan, Darling Hammond and Joshua Aronson of New York University, who each independently concluded that they contained no validity data at all.

But the state was warned it needed validity data. In at least three separate memos dating from 1998 to 2000, the TAG specifically asked the state both to research validity data and to create a complete “technical manual” for the tests.

As of the time of this publication, the state has yet to prove that its tests measure anything.

Where That Leaves Us

Still, Kadamus and other members of the Board of Regents insist the tests work.

“People who are college (admissions) people tell us that they’re getting kids who are more capable of doing college-level work,” Kadamus said. “Second, business people tell us that people who are going directly into the work force have higher levels of reading and writing skills. Third, we know from other measures, the SAT scores, the NAEP (National Assessment of Educational Progress) scores – a national test given across the country – that those scores are going up. They’ve gone up significantly.”

Donald Hossler, associate vice president for enrollment services at Indiana University, though, said while the university receives many undergraduate applications from New York state, there is no evidence that either the quantity or the quality of those applications is improving.

Dan Shelley, director of undergraduate admissions at Rochester Institute of Technology, said he, too, has seen no evidence that the Regents exams are improving the quality of applicants.

Jonathan Burdick, dean of admissions and financial aid at the University of Rochester, said he believes the quality of New York applicants has actually gone down as a result of the impact the Regents exams are having on curriculum.

“The state efforts have actually made it harder for them to do some of the other creative and innovative things,” Burdick said. “I think it’s pretty clear that because there’s so much testing so many of the other qualities, creative qualities, that we look for in our students – their interest in music, their ability to come up with unorthodox ideas – that’s harder to come by because there’s so much emphasis on testing.”

Members of the business community also did not leap to the tests’ defense. Sandra Parker, president of the Rochester Business Alliance, said “There seems to be enough questions about that, maybe (the tests) should be re-looked at.”

New York’s SAT scores are rising. But so are everyone else’s. In fact, while the state’s SAT scores have gone up over the last 10 years, they are behind the national average in the rate of increase – so comparatively speaking, New York’s SAT scores have gone down since the Regents exams were instituted.

University of Rochester psychologist Ryan said he’s never understood why the state would decide to use a system like the Regents exams.

“It’s rather tragic,” he said.