Sunday, January 25, 2015
Sunday, May 18, 2014
IRA Talks 2014
Thursday, April 3, 2014
Thursday, February 6, 2014
That said, I have noticed that as a result of high stakes accountability linked to teacher evaluations there seems to be a bit of a shuffle around identifying students for special education. While we are encouraged to "identify early", the Woodcock Johnson rarely finds deficits that warrant special education identification. Given current research on constrained skills theory (Scott Paris) and late emerging reading difficulties (Rollanda O’Connor), how do we make sure we are indeed identifying students early?
If a student has been with me for two years (Grades 1 and 2) and the instructional trajectory shows minimal progress on meeting benchmarks (despite quality research-based literacy instruction), but a special education evaluation using the Woodcock Johnson shows skills that fall within norms, how do we service these children? Title I is considered a regular education literacy program. Special Education seems to be pushing back on servicing these students, saying they need to "stay in Title I." Or worse, it is suggested that these students be picked up in SPED for phonics instruction, and continue to be serviced in Title I for comprehension.
I am wondering what your thoughts are on this. The "duplication of services" issue of being serviced by both programs aside, how does a school system justify such curriculum fragmentation for its most needy students? Could you suggest some professional reading or research that could help me make the case for both early identification of students at risk for late emerging reading difficulties, and the issue of duplication of services when both Title I and SPED service a student?
This is a great question, but one that I didn’t feel I could answer. As I’ve done in the past with such questions, I sent it along to someone in the field better able to respond. In this case, I contacted Richard Allington, past president of the International Reading Association, and a professor at the University of Tennessee. This question is right in his wheelhouse, and here is his answer:
I know of no one who advocates early identification of kids as pupils with disabilities (PWDs). At this point in time we have at least 5 times as many kids identified as PWDs [as is merited]. The goal of RTI, as written in the background paper that produced the legislation, is a 70-80% decrease in the numbers of kids labeled as PWDs. The basic goal of RTI is to encourage schools to provide kids with more expert and intensive reading instruction. As several studies have demonstrated, we can reduce the proportion of kids reading below grade to 5% or so by the end of 1st grade. Once on level by the end of 1st about 85% of kids remain on grade level at least through 4th grade with no additional intervention. Or as two other studies show, we could provide 60 hours of targeted professional development to every K-2 teacher to develop their expertise sufficiently to accomplish this. In the studies that have done this fewer kids were reading below grade level than when the daily 1-1 tutoring was provided in K and 1st. Basically, what the research indicates is that LD and dyslexics and ADHD kids are largely identified by inexpert teachers who don't know what to do. If Pianta and colleagues are right, only 1 of 5 primary teachers currently has both the expertise and the personal sense of responsibility for teaching struggling readers. (It doesn't help that far too many states have allowed teachers to avoid responsibility for the reading development of PWDs by removing PWDs from value-added computations of teacher effectiveness).
I'll turn to senior NICHD scholars who noted that, "Finally, there is now considerable evidence, from recent intervention studies, that reading difficulties in most beginning readers may not be directly caused by biologically based cognitive deficits intrinsic to the child, but may in fact be related to the opportunities provided for children learning to read." (p. 378)
In other words, most kids that fail to learn to read are victims of inexpert or nonexistent teaching. Or, they are teacher disabled not learning disabled. Only when American school systems and American primary grade teachers realize that they are the source of the reading problems that some kids experience will those kids ever be likely to be provided the instruction they need by their classroom teachers.
As far as "duplication of services" this topic has always bothered me because if a child is eligible for Title I services I believe that child should be getting those services. As far as fragmentation of instruction this does not occur when school districts have a coherent systemwide curriculum plan that serves all children. But most school districts have no such plan and so rather than getting more expert and more intensive reading lessons based on the curriculum framework that should be in place, struggling readers get a patchwork of commercial programs that result in the fragmentation. Again, that is not the kids as the problem but the school system as the problem. Same is true when struggling readers are being "taught" by paraprofessionals. That is a school system problem not a kids problem. In the end all of these school system failures lead to kids who never become readers.
Good answer, Dick. Thanks. Basically, the purpose of these efforts shouldn’t be to identify kids who will qualify for special education, but to address the needs of all children from the beginning. Once children are showing that they are not responding adequately to high quality and appropriate instruction, then the intensification of instruction (whether through special education, Title I, or improvements to regular classroom teaching) should be provided. Quality and intensity are what need to change, not placements. Early literacy is an amalgam of foundational skills that allow one to decode from print to language and language skills that allow one to interpret such language. If students are reaching average levels of performance on foundational skills, it is evident that they are attaining skill levels sufficient to allow most students to progress satisfactorily. If they are not progressing, then you need to look at the wider range of skills needed to read with comprehension. The focus of the instruction, the intensity of the instruction, and the quality of the instruction should be altered when students are struggling; the program placement or labels, not so much.
Sunday, July 21, 2013
Sunday, April 7, 2013
Reading comprehension tests are not goals… they are samples of behaviors that represent the goals… and they are useful right up until teachers can’t distinguish them from the goals themselves.
Saturday, October 31, 2009
Teachers there told me they were testing some kids weekly or biweekly. That is too much. How do I know it is too much?
The answer depends on two variables: the standard error of measurement of the test you are using and the student growth rate. The more certain of the scores (the lower the SEM that is), the more often you could profitably test... And the faster that students improve on the measure relative to the SEM of the test, the more often you can test.
On something like DIBELS, the standard errors are reasonably large compared to the actual student growth rate--thus, on that kind of measure it doesn't make sense to measure growth more than 2-3 times per year. Any more than that, and you won't find out anything about the child (just the test).
The example in my powerpoint below is based on the DIBELS oral reading fluency measure. For that test, kids read a couple of brief passages (1 minute each) and a score is obtained in words correct per minute. Kids in first and second grade make about 1 word improvement per week on that kind of measure.
However, studies reveal this test has a standard error of measurement of 4 to 18 words... that means, under the best circumstances, say the student scores 50 wcpm on the test, then we can be 68 percent certain that this score is someplace between 46 and 54 (under the best conditions when there is a small SEM). That means that it will be at least 4 weeks before we would be able to know whether the child was actually improving. Sooner than that, and any gains or losses that we see will likely be due to the standard error (the normal bouncing around of test scores).
And that is the best of circumstances. As kids grow older, their growth rates decline. Older kids usually improve 1 word every three or four weeks on DIBELS. In those cases, you would not be able to discern anything new for several months. But remember, I'm giving this only 68 percent confidence, not 95 percent, and I am assuming that DIBELS has the smallest SEMs possible (not likely under normal school conditions). Two or three testings per year is all that will be useful under most circumstances.
More frequent testing might seem rigorous, but it is time wasting, misinformative, and simply cannot provide any useful information for monitoring kids' learning. Let's not just look highly committed and ethical by testing frequently; let's be highly committed and ethical and avoid unnecessary and potentially damaging testing.
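The arithmetic behind those estimates can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not a formal psychometric procedure: the function names are made up for this example, the z value of 1.96 for 95 percent confidence comes from the normal distribution, and the sketch ignores the fact that the earlier score is noisy too (which would make the wait even longer).

```python
def score_band(score, sem, z=1.0):
    """Range in which the true score probably lies.

    z=1.0 gives roughly 68 percent confidence, z=1.96 roughly 95 percent.
    """
    return (score - z * sem, score + z * sem)


def weeks_until_growth_exceeds_error(sem, words_per_week, z=1.0):
    """Weeks of average growth needed before a gain is larger than z SEMs."""
    return z * sem / words_per_week


BEST_CASE_SEM = 4.0  # smallest SEM reported for oral reading fluency

# A student who scores 50 wcpm, under the best conditions.
print(score_band(50, BEST_CASE_SEM))  # (46.0, 54.0)

# First and second graders: about 1 word of growth per week.
print(weeks_until_growth_exceeds_error(BEST_CASE_SEM, 1.0))  # 4.0

# Older kids: about 1 word every three or four weeks.
for weeks_per_word in (3, 4):
    weeks = weeks_until_growth_exceeds_error(BEST_CASE_SEM, 1 / weeks_per_word)
    print(f"1 word per {weeks_per_word} weeks: about {weeks:.0f} weeks")

# Asking for 95 percent rather than 68 percent confidence nearly doubles the wait.
weeks_95 = weeks_until_growth_exceeds_error(BEST_CASE_SEM, 1.0, z=1.96)
print(f"{weeks_95:.1f} weeks")
```

With the larger SEMs that are more typical of ordinary school conditions, every one of these waits gets proportionally longer, which is why two or three testings a year is the practical ceiling.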
Here is the whole presentation.
Monday, September 28, 2009
Which ways of indicating book difficulty work best?
This question came up because the inquirer wondered if it mattered whether she used Lexiles, Reading Recovery, or Fountas and Pinnell levels. The various responses suggested a whiff of bias against Lexiles (or, actually, against traditional measures of readability including Lexiles).
So are all the measures of book difficulty the same? Well, they are and they’re not. It is certainly true that historically most measures of readability (including Lexiles) come down to two measurements: a word difficulty measure and a sentence difficulty measure. These factors are weighted and combined to predict some criterion. Although Lexiles include the same components as traditional readability formulas, they predict different criteria. Lexiles are lined up with an extensive database of test performance, while most previous formulas predict the levels of subjectively sequenced passages. Also, Lexiles have been more recently normed. One person pointed out that Lexiles and other traditional measures of readability tend to come out the same (correlations of .77), which I think is correct, but because of the use of recent student reading as the criterion, I usually go with the Lexiles if there is much difference in an estimate.
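To make the "two measurements, weighted and combined" idea concrete, here is a small Python sketch of one traditional formula, the Flesch-Kincaid grade level. It is swapped in only because its weights are public; it is not the Lexile formula. The syllable counter is a crude vowel-run heuristic invented for this example, so the grade levels it produces are rough.

```python
import re


def count_syllables(word):
    """Crude heuristic (an assumption): one syllable per run of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Combine a sentence measure and a word measure into one grade estimate."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    words_per_sentence = len(words) / len(sentences)  # sentence difficulty
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)  # word difficulty
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


easy = "The cat sat. The dog ran. We had fun."
hard = ("Comprehension emerges from complicated interactions between "
        "individual readers and unfamiliar informational texts.")

print(round(flesch_kincaid_grade(easy), 2))
print(round(flesch_kincaid_grade(hard), 2))
```

The short-word, short-sentence passage comes out far below the long-word, long-sentence one, and that is all such a formula can see: nothing about topic, interest, or what the reader already knows.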
Over the years, researchers have challenged readability because it is such a gross index of difficulty (obviously there is more to difficulty than sentences and words), but theoretically sound descriptions of text difficulty (such as those of Walter Kintsch and Arthur Graesser) haven’t led to appreciably better text difficulty estimates. Readability usually explains about 50% of the variation in text difficulty, and these more thorough and cumbersome measures don’t do much better.
One does see a lot of Fountas and Pinnell and Reading Recovery levels these days. Readability estimates are usually only accurate within about a year, and that is not precise enough to help a first-grade teacher to match her kids with books. So these schemes claim to make finer distinctions in text difficulty early on, but these levels of accuracy are open to question (I only know of one study of this and it was moderately positive), and there is no evidence that using such fine levels of distinction actually matters in student learning (there is some evidence of this with more traditional measures of readability).
If anything, I think these new schemes tend to put kids into more levels than necessary. They probably correlate reasonably well with readability estimates, and their finer-grained results probably are useful for early first grade, but I’d be hard pressed to say they are better than Lexiles or other readability formulas even at these levels (and they probably lead to over grouping).
Why does readability work so poorly for this?
I’m not sure that it really does work poorly despite the bias evident in the discussion. If you buy the notion that reading comprehension is a product of the interaction between the reader and the text (as most reading scholars do), why would you expect text measures to measure much more than half the variance in comprehension? In the early days of readability formula design, lots of text measures were used, but those fell away as it became apparent that they were redundant and 2-3 measures would be sufficient. The rest of the variation is variation in children’s interests and knowledge of topics and the like (and in our ability to measure student reading levels).
Is the right level the one that students will comprehend best at?
One of the listserv participants wrote that the only point to all of this leveling was to get students into texts that they could understand. I think that is a mistake. Often that may be the reason for using readability, but that isn’t what teachers need to do necessarily. What a teacher wants to know is “at what level will a child make optimum learning gains in my class?” If the child will learn better from something hard to comprehend, then, of course, we’d rather have them in that book.
The studies on this are interesting in that they suggest that sometimes you want students practicing with challenging text that may seem too hard (like during oral reading fluency practice) and other times you want them practicing with materials that are somewhat easier (like when you are teaching reading comprehension). That means we don’t necessarily want kids only reading books at one level: we should do something very different with a guided reading group that will discuss a story, and a paired reading activity in which kids are doing repeated reading, and an independent reading recommendation for what a child might enjoy reading at home.
But isn’t this just a waste of time if it is this complicated?
I don’t think it is a waste of time. The research certainly supports the idea that students do better with some adjustment and book matching than they do when they work whole class on the same level with everybody else.
However, the limitations in testing kids and testing texts should give one pause. It is important to see such data as a starting point only. By all means, test kids and use measures like Lexiles to make the best matches that you can. But don’t end up with too many groups (meaning that some kids will intentionally be placed in harder or easier materials than you might prefer), move kids if a placement turns out to be easier or harder on a daily basis than the data predicted, and find ways to give kids experiences with varied levels of texts (from easy to challenging). Even when a student is well placed, there will still be selections that turn out to be too hard or too easy, and adjusting the amount of scaffolding and support is necessary. That means that teachers need to pay attention to how kids are doing, and respond to these needs to make sure the student makes progress (i.e., improves in what we are trying to teach).
If you want to know more about this kind of thing, I have added a book to my recommended list (at the right here). It is a book by Heidi Mesmer on how to match texts with kids. Good luck.
Thursday, August 20, 2009
I visited schools yesterday that used to DIBEL. You know what I mean, the teachers used to give kids the DIBELS assessments to determine how they were doing in fluency, decoding, and phonemic awareness. DIBELS has been controversial among some reading experts, but I’ve always been supportive of such measures (including PALS, TPRI, AIMSweb, etc.). I like that they can be given quickly to provide a snapshot of where kids are.
I was disappointed that they dropped the tests and asked why. “Too much time,” they told me, and when I heard their story I could see why. This was a district that liked the idea of such testing, but their consultants had pressured them into repeating it every week for at risk kids. I guess the consultants were trying to be rigorous, but eventually the schools gave up on it altogether.
The problem isn’t the test, but the silly testing policies. Too many schools are doing weekly or biweekly testing and it just doesn’t make any sense. It’s as foolish as checking your stock portfolio every day or climbing on the scale daily during a diet. Experts in those fields understand that too much assessment can do harm, so they advise against it.
Frequent testing is misleading and it leads to bad decisions. Investment gurus, for example, suggest that you look at your portfolio only every few months. Too many investors look at a day’s stock losses and sell in a panic, because they don’t understand that such losses happen often, and that long term such losses mean nothing. The same kind of thing happens with dieting. You weigh yourself and see that you’re down 2 pounds, so what the heck, you can afford to eat that slice of chocolate cake. But your weight varies through the day as you work through the nutrition cycle (you don’t weigh 130, but someplace between 127 and 133). So, when your weight drops from 130 to 128, you think “bring on the dessert” when your real weight hasn't actually changed since yesterday.
The same kind of thing happens with DIBELS. Researchers investigated the standard error of measurement (SEM) of tests like DIBELS (Poncy, Skinner, & Axtell, 2005 in the Journal of Psychoeducational Assessment) and found standard errors of 4 to 18 points with oral reading fluency. That’s the amount that the test scores jump around. They found that you could reduce the standard error by testing with multiple passages (something that DIBELS recommends, but most schools ignore). But, testing with multiple passages only got the SEM down to 4 to 12 points.
What does that mean? Well, for example, second graders improve in words correct per minute (WCPM) in oral reading about 1 word per week. That means it would take 4 to 12 weeks of average growth for the youngster to improve more than a standard error of measurement.
If you test Bobby at the beginning of second grade and he gets a 65 wcpm in oral reading, then you test him a week later and he has a 70, has his score improved? That looks like a lot of growth, but it is within a standard error so it is probably just test noise. If you test him again in week 3, he might get a 68, and week 4 he could reach 70 again, and so on. Has his reading improved, declined, or stagnated? Frankly, you can’t tell in this time frame because on average a second grader will improve about 3 words in that time, but the test doesn’t have the precision to identify reliably a 3-point gain. The scores could be changing because of Bobby’s learning, or because of the imprecision of the measurement. You simply can't tell.
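Bobby's weekly scores can be checked against the standard error in a short Python sketch. The SEM of 8 used here is an assumption (the midpoint of the 4 to 12 range reported for multiple passages), and the function name is invented for this illustration.

```python
def classify_changes(scores, sem):
    """Compare each later score with the first one.

    Returns (week, change, exceeds_sem) tuples, where exceeds_sem is True
    only when the change from baseline is larger than one SEM.
    """
    baseline = scores[0]
    results = []
    for week, score in enumerate(scores[1:], start=2):
        change = score - baseline
        results.append((week, change, abs(change) > sem))
    return results


bobby_scores = [65, 70, 68, 70]  # weeks 1 through 4, from the example above
ASSUMED_SEM = 8.0                # assumption: midpoint of the 4-12 point range

for week, change, exceeds in classify_changes(bobby_scores, ASSUMED_SEM):
    verdict = "beyond one SEM" if exceeds else "within one SEM, could be noise"
    print(f"week {week}: {change:+d} wcpm ({verdict})")
```

Every change falls inside one standard error at that assumed SEM. Only at the very smallest reported SEM of 4 would the 5-point jump edge past it, and even then just barely and only at 68 percent confidence.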
Stop the madness. Let’s wait 3 or 4 months, still a little quick, perhaps, but since we use multiple passages to estimate reading levels, it is probably okay. In that time frame, Bobby should gain about 12-16 words correct per minute if everything is on track. If the new testing reveals gains that are much lower than that, then we can be sure there is a problem, and we can make some adjustment to instruction. Testing more often can’t help, but it might hurt!