Thursday, April 3, 2014
Thursday, May 3, 2012
They wouldn’t be allowed within miles of a school on testing days, and they would only be given general information about the results (e.g., “your class was in the bottom quintile of fourth grades in reading”). Telling a teacher the kinds of test questions or about the formatting would be punished severely, too.
In that fantasy, teachers would be expected to try to improve student reading scores by… well, by teaching kids to read without regard to how it might be measured later. I have even mused that it would be neat if the test format changed annually to even discourage teachers from thinking about teaching to a test format.
In some ways, because of the Common Core, my fantasy is coming true (maybe Heidi K. isn’t far behind?).
Principals and teachers aren’t sure what these tests look like right now. The whole system has been reset, and the only sensible solution is… teaching.
And yet, I am seeing states that are holding back on rolling out the Common Core until they can see the test formats.
Last week, Cyndie (my wife – yes, she knows all about Heidi and me – surprisingly, she doesn’t seem nervous about it) was contacted by a state department of education trying to see if she had any inside dope on the PARCC test.
This is crazy. We finally have a chance to raise achievement and these test-chasing bozos are working hard to put us back in the ditch. There is no reason to believe that you will make appreciable or reliable gains teaching kids to reply to certain kinds of test questions or to particular test formats (you can look it up). The people who push such plans know very little about education (can they show you the studies of their “successful” test-teaching approaches?). I am very pleased with the unsettled situation in which teachers and principals don’t know how the children’s reading is going to be evaluated; it is a great opportunity for teachers and kids to show what they can really do.
Saturday, August 9, 2008
Much has been made in recent years of the political class’s embrace of the idea of test-based accountability for the schools. Such schemes are enshrined in state laws and NCLB. On the plus side, such efforts have helped move educators to focus on outcomes more than we traditionally have. No small change, this. Historically, when a student failed to learn it was treated as a personal problem, something beyond the responsibility of teachers or schools. That was fine, I guess, when “Our Miss Brooks” was in the classroom and teachers were paid a pittance. Not much public treasure was at risk, and frankly low achievement wasn’t a real threat to kids’ futures (with so many reasonably well-paying jobs available at all skill levels). As the importance and value of doing well has changed, so have the demands for accountability.
Sadly, politicos have been badly misled on the accuracy of tests, and technically achievement testing has just gotten really complicated—well beyond the scope of what most legislative education aides can handle.
And so, here in Illinois we have a new test scandal brewing (requiring the rescoring of about 1 million tests). http://www.suntimes.com/news/education/1099086,CST-NWS-tests09.article
Two years ago Illinois adopted a new state test. This test would be more colorful and attractive and would have some formatting features that would make it more appealing to the kids who had to take it. What about the connection of the new test with the test it was to replace? Not to worry, the state board of education and Pearson publishing’s testing service were on the game: they were going to equate the new test with the old statistically so the line of growth or decline would be unbroken, and the public would know if schools were improving, languishing, or slipping down.
A funny thing happened, however: test scores jumped immediately. Kids in Illinois all of a sudden were doing better than ever before. Was it the new tests? I publicly opined that it likely was; large drops or gains in achievement scores are unlikely, especially without any big changes in public policy or practice. The state board of education, the testing companies, and even the local districts chimed in saying how “unfair” it was that anyone would disparage the success of our school kids. They claimed there was no reason to attribute the scores’ sudden upward trend to the coincidental change in tests, and frankly they were not happy about kill-joys like me who would dare question their new success (it was often pointed out that teachers were working very hard, the Bobby Bonds defense: I couldn’t have done anything wrong since I was working hard).
Now, after two years of that kind of thing, Illinois started using a new form of this test. The new form was statistically equated with the old form, so it could not possibly have any different results. Except that it did. Apparently, the scores came back this summer much lower than they had been during the past two years. So much lower, in fact, that the educators recognized that it could not possibly be due to a real failure of the schools, but must be a testing problem. Magically, the new equating was found to be screwed up (a wrong formula, apparently). Except, Illinois officials have not yet released any details about how the equating was being done. Equating can get messed up by computing the stats incorrectly, but it can also be influenced by how, when, and from whom the data are collected.
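For readers who want to see how that can happen, here is a minimal sketch of linear equating, one common textbook approach. It is not the procedure Illinois or Pearson used (those details have not been released), and the scores and function names below are made up purely for illustration. It shows that the same formula gives a different answer when the new form is calibrated on a group that is stronger than the kids who will actually be tested.

```python
from statistics import mean, pstdev


def linear_equate(new_form_scores, old_form_scores):
    """Return a function that maps a new-form score onto the old-form scale.

    Linear equating matches the mean and standard deviation of the two
    score distributions. It assumes the two groups are equally able.
    """
    mu_new, sd_new = mean(new_form_scores), pstdev(new_form_scores)
    mu_old, sd_old = mean(old_form_scores), pstdev(old_form_scores)
    slope = sd_old / sd_new
    return lambda score: mu_old + slope * (score - mu_new)


# Hypothetical raw scores, invented for this example.
old_form = [40, 45, 50, 55, 60]

# Case 1: an equally able group takes the new form, which is 10 points easier.
new_form_equal_group = [50, 55, 60, 65, 70]
fair = linear_equate(new_form_equal_group, old_form)

# Case 2: the new form is calibrated on a stronger-than-typical group.
new_form_strong_group = [55, 60, 65, 70, 75]
biased = linear_equate(new_form_strong_group, old_form)

print(fair(60))    # 50.0: a raw 60 on the new form equals a 50 on the old
print(biased(60))  # 45.0: same raw score, but it now looks 5 points worse
```

In the second case nothing is wrong with the arithmetic; the scores drop only because of who supplied the calibration data.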
It’s interesting that when scores rise the educational community is adamant that it must be due to their successes, but when they fall, as they apparently did this year in Illinois, it must be a testing problem.
Illinois erred in a number of ways, but so have many states in this regard.
The use of a single form of a single measure administered to large numbers of children in order to make important public policy decisions is foolish. It turns out there are many forms of the test Illinois is using. It is foolish that they didn’t use multiple forms simultaneously (like they would have if it had been a research study), as this can help to do away with their “rubber ruler” problem. Sadly, conflicting purposes for testing programs have us locked into a situation where we’re more likely to make mistakes than to get it right.
I’m a fan of testing (yes, I’ve worked on NAEP, ACT, and a number of commercial tests), and am a strong proponent of educational accountability. It makes no sense, however, to try to do this kind of thing with single tests. It isn’t even wise to test every child. Public accountability efforts need to focus their attention on taking a solid overall look at performance on multiple measures without trying to get too detailed about the information on individual kids. Illinois got tripped up when they changed from testing schools to testing kids (teachers didn’t think kids would try hard enough if they weren’t at risk themselves, so our legislature went from sampling the state to testing every kid; of course, if you want individually comparable data it only makes sense to test kids on the same measure).
Barack Obama has called for a new federal accountability plan that will make testing worthwhile to teachers by providing individual diagnostic information. That kind of plan sounds good, but ultimately it will require a lot more individual testing, with single measures (as opposed to multiple alternative measures). Instead of getting a clearer or more efficient picture for accountability purposes, and one less likely to be flawed by the rubber ruler problem, it can’t help but be muddled, as in Illinois. This positive-sounding effort will be more expensive and will result in a less clear picture in the long run.
Accountability testing aimed at determining how well public institutions are performing would be better constructed along the lines of the National Assessment, which uses several forms of a test simultaneously with samples of students representing the states and the nation. NAEP has to do some fancy statistical equating, too, but this is more likely to be correct when several overlapping forms of the test are used each year. By not trying to be all things to all people, they manage to do a good job of letting the public and policymakers know how our kids are performing.