Recently, I wrote about the Reading First Impact Study in this space. That post struck a nerve, drawing a good deal of attention and many questions, so here I will answer some of them. Feel free to send more along and I’ll see what I can do. I hope this helps readers better understand the study; recent political statements suggest that many politicians, at least, don’t get it.
The Impact Study showed that the Reading First schools made no improvements in reading comprehension, so the program did not work, right?
No, that’s not correct. The Reading First Impact Study collected no comparison data from these schools for years prior to their entry into the Reading First program. With no comparable data from previous years, it is impossible to determine if the Reading First schools improved or not.
That makes no sense… the schools had to have test data from previous years to be identified as Reading First eligible. Why not use those test data?
There definitely are past years’ test data from all of the Reading First schools and from all or most of the comparison Title I schools. Unfortunately, these data are not comparable or combinable. The problem is that with so many states and districts participating in the Impact Study, each using its own tests, it would be very expensive and nearly impossible to do enough equating studies. Of course, many districts and states have been reporting that they are doing better on their local measures. This may make it seem as if the state data contradict the national data, but they do not. The Reading First Impact Study did not attempt to measure these kinds of improvements against past years’ achievement levels, though at least some of the states did, and Reading First did well in those analyses.
If the Impact Study didn’t look at reading improvement against past years’ performance levels, what did it look at?
The Impact Study compared reading achievement gains for Reading First and non-Reading First schools. The study attempted to tell, from the beginning of the study to the end of the study, whether kids learned to read better as a result of being in Reading First schools.
Is it true that the comparison group schools were too different from the Reading First schools to allow a fair comparison?
No. Although the kids and the schools were not exactly the same at the beginning of the study, the research design that was used provides the closest comparison possible without a randomized control trial. If the schools had been randomly assigned to the Reading First treatment and control group it would have been an even better study. Nevertheless, the regression discontinuity design that was used should have provided a fair and conservative test of the effectiveness of Reading First since it guarantees a comparison with the most similar non-Reading First schools with regard to initial reading achievement, student mobility, free lunch eligibility, and so on. Reading First schools tended to be very slightly worse performing in these various measures at the beginning of the study, but not significantly so.
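For readers who want to see the logic of a regression discontinuity comparison, here is a minimal sketch in Python. It is not the Impact Study's actual model. The cutoff, the scores, the funding rule (schools below the cutoff get funded), and the 3-point effect are all made-up assumptions, and the data contain no noise so the arithmetic is easy to follow. The idea is to fit a line to the funded schools and a line to the unfunded schools, then compare where the two lines land at the cutoff itself.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rd_estimate(scores, outcomes, cutoff):
    """Gap between the two fitted lines at the cutoff (funded minus unfunded)."""
    funded = [(s, o) for s, o in zip(scores, outcomes) if s < cutoff]
    unfunded = [(s, o) for s, o in zip(scores, outcomes) if s >= cutoff]
    slope_f, intercept_f = fit_line(*zip(*funded))
    slope_u, intercept_u = fit_line(*zip(*unfunded))
    return (intercept_f + slope_f * cutoff) - (intercept_u + slope_u * cutoff)


# Hypothetical data: one school per eligibility score from 20 to 59.
# Schools scoring below the (assumed) cutoff of 40 are funded and get
# an assumed 3-point boost in the outcome.
CUTOFF = 40
scores = list(range(20, 60))
outcomes = [50 + 0.5 * s + (3.0 if s < CUTOFF else 0.0) for s in scores]

effect = rd_estimate(scores, outcomes, CUTOFF)
print(f"Estimated effect at the cutoff: {effect:.2f}")
```

Because the schools just on either side of the cutoff are nearly identical in their starting characteristics, the gap at the cutoff is a reasonable estimate of the program's effect, provided the unfunded schools really did go without the program.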
Why not just do a randomized control trial if that is better?
The Institute of Education Sciences, which commissioned and supervised this study, definitely wanted to do it that way, but the study began too late: the Reading First money had already been distributed by the time the study was under way. Various Department of Education officials and consultants were very angry about this but ultimately agreed with the research experts and statisticians that regression discontinuity would provide the best comparison under the circumstances.
If the study compared the performance of Reading First schools with the performance of very similar controls, then doesn’t the study show that Reading First doesn’t work?
That was certainly the idea of the study and it may be showing that, but there are reasonable alternative explanations of the data that can’t be ruled out. That’s the problem. If everybody was nearly equal at the beginning of the study in instructional context and student achievement and then you put Reading First programs in half the schools, you should be able to determine whether Reading First kids were advantaged. That was the idea of the study. But it might not have worked the way it was planned.
Why didn’t the study work?
The assumption that the comparison group schools would continue with their initial instructional practices while the Reading First schools were changing theirs seems not to have been met. Researchers refer to this as “contamination.” For a comparison to work, the two groups have to engage in different practices. If they are doing the same things, why would you expect outcome differences to result for one of the groups?
For example, if I were setting up an experiment, I would try to arrange it so that my experimental and control classes were in different schools if possible, because teachers may share ideas in the teachers’ lounge and then my experimental innovations may start appearing in the control classrooms. The more this happens, of course, the less chance I have of finding differences in the end. And when my experiment shows no effects, does it mean that my treatment didn’t work, or just that it worked in both sets of classrooms?
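A little arithmetic shows how quickly contamination can erase a difference. The sketch below is only an illustration. It assumes, for simplicity, that the benefit a group gets is proportional to the share of its classrooms using the new practices, and the 10-point effect and the uptake rates are made-up numbers, not figures from any study.

```python
def observed_gap(true_effect, treated_uptake, comparison_uptake):
    """Expected score gap between two groups, assuming (hypothetically)
    that benefit is proportional to the share of classrooms using the practices."""
    return true_effect * (treated_uptake - comparison_uptake)


# Hypothetical: the practices are worth 10 points where fully used,
# and every experimental classroom uses them.
for comparison_uptake in (0.0, 0.5, 0.9):
    gap = observed_gap(
        true_effect=10.0,
        treated_uptake=1.0,
        comparison_uptake=comparison_uptake,
    )
    print(f"Comparison uptake {comparison_uptake:.0%}: observed gap {gap:.1f} points")
```

In this toy setup the treatment is equally effective in every row; only the measured gap shrinks as the comparison classrooms pick up the same practices.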
Is there any reason to believe that kind of contamination affected the Impact Study?
Yes, in fact, there are lots of reasons to think this was the case. First, let’s start with the Reading First law itself. The U.S. Department of Education distributed approximately $1 billion per year for Reading First. About 80% of this money was given to the states to pass on to the Reading First schools to be used to purchase materials, hire coaches, provide professional development for teachers and principals, and pay for interventions for struggling readers. The other $200 million per year was to be spent by the states to try to contaminate the comparison sample.
You made that up. Does NCLB really say that the states were to contaminate these data?
The law is worded differently than I put it, but in fact, that is exactly the idea of it. The President and Congress recognize that there are too many failing schools for the feds to bail out all of them (that would just be too expensive). What they do instead is provide money for the establishment of quality programs with the hope that this will leverage state and local dollars towards solving the rest of the problem. In this case, they actually earmarked about $1 billion to be used by the states to try to encourage Reading First reforms through the entire school system, especially with those other failing schools. As I said earlier, the more the other schools adopt Reading First practices, the less meaningful any comparison becomes. (Not only were the states strongly encouraged to spread the program beyond the Reading First schools, but there were other Department of Education initiatives to encourage this, from presentations of Reading First approaches at Title I conferences to special initiatives like “Expanding the Reach” that set out to give schools incentives to use their Title I funding to carry out Reading First-style initiatives.)
Just because lots of money was spent by the feds to get other schools to adopt Reading First strategies does not mean that they actually did it, right?
That’s true. But my own experience in visiting various districts suggests to me that many districts adopted Reading First practices district-wide. Let’s say you have 25 schools in your district and four of these schools were Reading First eligible. You take the Reading First funds and carry out the initiative in those four schools, but what about the other schools? You could continue what you have been doing in the past, or you could repurpose your funds to duplicate the Reading First efforts in all of your schools. That means 25 schools would be using the Reading First model, even though only four were funded. That’s a terrific deal for the federal government (they managed to guide a reform in a large number of schools for a relatively small amount of direct expenditure). If your district was part of the Impact Study, however, it would certainly have contaminated the sample to some extent.
This week I sent a few emails to friends around the country. I told them that I personally knew of districts that had done this kind of district-wide Reading First effort and I wondered if they knew of any others. Below I have listed the districts that this very informal (and unscientific) survey uncovered. These are sizable districts and at least some of them actually did take part in the Impact Study. I wonder what a more formal study would show. Given how many contaminated districts I identified without looking hard, I suspect we’d find that many of the Reading First reforms were intentionally duplicated by comparison schools, which wrecks the comparison. (I have not listed those schools and districts that partially adopted the reforms. For example, in many districts, since the Reading First schools were getting a core program, they bought the same program district-wide. Or places like Chicago adopted DIBELS testing in all primary grade classrooms, not just the Reading First schools. These situations certainly would introduce contamination to the study, but this kind of partial replication is so common and so widespread that I would likely need to list most of the Reading First districts and a large percentage of states.)
In a survey of California Reading First's 121 school districts, 47 of the 52 respondents indicated that they had duplicated the practices of Reading First district-wide.
Also, Florida has a state policy requiring that all districts have a core program, a 90-minute literacy block, and screening and monitoring assessments. Of course, some local districts in Florida have made particular efforts to carry over Reading First to their other schools (some of those are listed below). I've been told that the same is true in Alabama, but I haven't been able to verify.
The Bureau of Indian Education has expanded Reading First into 35 non-Reading First schools, too.
Here are some school districts, large and small, that adopted the Reading First reforms district wide. If I hear of more, I’ll add them to the list. You can see the problem.
East Aurora, IL
North Platte, NE
Hillsboro Co., OR
Jefferson Co., OR
Klamath Co., OR
Great Falls, MT
Laramie, WY (Albany #1)
Fort Morgan, CO
Ogden City, UT
Richmond Co., GA
North Sanpete School District, UT
San Juan School District, UT
Collier County, FL
Broward County, FL
Wilmington, DE (Christina District)
If a bunch of non-Reading First schools adopted the same reading reforms, would that mean the study was contaminated?
No matter how widely this phenomenon occurred, if it didn’t occur in the districts that participated in the study, then it would not matter. However, as I indicated above, at least some of the districts in the study did follow a policy that required the use of Reading First practices in non-Reading First schools. (And my list above only includes districts that were trying to duplicate the entire Reading First effort in their other schools. There were also districts that did this only partially: for example, the Chicago Public Schools adopted DIBELS monitoring district-wide, but didn’t try to spread the entire reform package. Partial imitations are contaminating, too.)
Also, the feds did not just do an Impact Study. They carried out an implementation study that examined the instructional practices in the two sets of schools. Some of these data have already been reported and more will be reported this fall. What the already released data show is that Reading First and non-Reading First schools were quite similar in their instructional practices and that they became increasingly similar as the study progressed year to year. So, in Year 1 Reading First schools were much more likely to adopt a new, research-based core program than were the comparison schools (a big difference). But by Year 3, most of the non-Reading First schools were using the same kinds of programs (and often the identical program). The same thing seemed to happen with coaches. Coaches were prevalent in Reading First schools from the beginning, but they became increasingly available in Title I comparison schools as the study progressed. This also happened with setting aside an uninterrupted instructional block for reading, as well as for some of the other significant instructional reforms.
Usually in a multi-year experiment of this type, the impact of the treatment grows each year as more innovations are implemented and as the distance between the schools grows with regard to their instructional practices. With the Reading First study, big initial differences declined over time as other schools parroted the Reading First practices.
If lots of schools took on the Reading First reforms, we might not be able to see differences among those schools, but shouldn’t achievement be improving overall since so many schools would be using these practices?
That’s a fair point. And yes, that appears to be the pattern that we are seeing with the National Assessment of Educational Progress. In fact, NAEP scores have been rising during the period in question. NAEP shows small but clearly significant improvements for fourth graders (particularly on its trend items), and the various local studies are saying that state test scores have been rising, too. Yes, it is possible that these increases are real and that they have been stimulated by Reading First.
Wouldn’t that mean that Reading First was actually a big success?
It would if it could be proven that the changes in instructional practices that have been taking place are actually due to Reading First. Although there clearly are instances noted above where schools adopted practices because they were used by Reading First, there are also cases where other factors may have actually led to the change. Districts like Los Angeles and Chicago had already hired reading coaches before Reading First money was even available (they were relying on the same research base used by the Reading First creators, but were acting independently). Many districts that use core programs refurbish those programs every 4 or 5 years; their latest adoption may have been a program that could be used in Reading First, but that might not have been why they selected the program. Reading First might have been the pivot point for all of these changes, or it might have been just one of many sources of information used by the districts.