When I was working on my doctorate, I had to conduct a historical study for one of my classes. I went to the Library of Congress and calculated readabilities for books that had been used to teach reading in the U.S. (or in the colonies that became the U.S.). I started with the Protestant Tutor and the New England Primer, the first books used for reading instruction here. From there I examined Webster’s Blue-Backed Speller and its related volumes and the early editions of McGuffey’s Readers.
Though the authors left no record of how those books were created, it is evident that they had sound intuitions about what makes text challenging. Even in the relatively brief single-volume Tutor and Primer, the materials got progressively more difficult from beginning to end. These earliest books ramped up in difficulty very quickly (you read the alphabet on one page and simple syllables on the next, followed by a relatively easy read, but then the challenge level would jump markedly).
By the time we get to the three-volume Webster, the readability levels adjust more slowly from book to book with the speller (the first volume) being by far the easiest, and the final book (packed with political speeches and the like) being all but unreadable (kind of like political speeches today).
In the 1920s, psychologists began searching for measurement tools that would allow them to describe the readability or comprehensibility of texts. In other words, they wanted to turn these intelligent intuitions about text difficulty into tools that anyone could use. That work has proceeded by fits and starts over the past century and has resulted in a plethora of readability measures.
Readability research has usually focused on reading comprehension as the outcome. Researchers have readers do something with a set of texts (e.g., answer questions, complete maze or cloze tasks), and then they try to predict those performance levels by counting easy-to-measure characteristics of the texts (words and sentences). The idea is to use easily measured or counted text features to place the texts on a scale from easy to hard that agrees with how readers actually did with them.
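To make that concrete, here is a minimal Python sketch of the counting approach, not any particular publisher's method. It computes two easy-to-count features (words per sentence and syllables per word) and combines them with the published Flesch-Kincaid grade-level weights, chosen here only as a familiar example. The function names and the crude vowel-group syllable counter are assumptions for illustration; real tools count syllables and detect sentence boundaries far more carefully.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum 1."""
    vowel_groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(vowel_groups))


def text_features(text: str) -> dict:
    """Count the surface features that classic readability formulas rely on."""
    # Treat ., !, and ? as sentence boundaries (naive; abbreviations will fool it).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return {
        "words_per_sentence": len(words) / len(sentences),
        "syllables_per_word": syllables / len(words),
    }


def flesch_kincaid_grade(text: str) -> float:
    """Combine the two features using the Flesch-Kincaid grade-level weights."""
    f = text_features(text)
    return 0.39 * f["words_per_sentence"] + 11.8 * f["syllables_per_word"] - 15.59


easy = "The cat sat. The dog ran."
hard = "Constitutional deliberations necessitate extraordinarily comprehensive considerations."
print(flesch_kincaid_grade(easy) < flesch_kincaid_grade(hard))
```

A passage of short sentences and short words lands lower on the scale than one built from long, polysyllabic words, which is all a formula of this kind can tell you; it knows nothing about the ideas in the text or the reader's background.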
Educators stretched this idea of readability to one of learnability. Instead of trying to predict how well readers would understand a text, educators wanted to use readability to predict how well students would learn from such texts. Thus, the idea of “instructional level”: if you taught students with books that appropriately matched their reading levels, they would learn more. If you placed them in materials that were relatively easier or harder, there would be less learning. This theory has not held up very well when empirically tested. Students seem to be able to learn from a pretty wide range of text difficulties, depending on the amount of teacher support.
The Common Core State Standards (CCSS) did not buy into the instructional level idea. Instead of accepting the claim that students needed to be taught at “their levels,” the CCSS recognizes that students will never reach the needed levels by the end of high school unless harder texts are used for teaching; harder not only relative to students’ instructional levels, but harder also in terms of which texts are assigned to which grade levels. Thus, for grades 2-12, CCSS assigned higher Lexile levels to each grade than in the past (the so-called stretch bands).
Lexiles is a recent scheme for measuring readability. Initially, it was the only readability measure accepted by the Common Core. That is no longer the case. CCSS now provides guidance on how to match books to grade levels using several formulas. This change does not take us back to using easier texts for each grade level. Nor does it back down from encouraging teachers to work with students at levels higher than their so-called instructional levels. It does mean that it will be easier for schools to identify appropriate texts using any of six different approaches, many of which are already widely used by schools.
Of course, there are many other schemes that could have been included by CCSS (there are at least a couple of hundred readability formulas). Why aren’t they included? Will they be going forward?
From looking at what was included, it appears to me that CCSS omitted two kinds of measures. First, they omitted schemes that have not been used often (few publishers still use Dale-Chall or the Fry Graph to specify text difficulties, so there would be little benefit in connecting them to the CCSS plan). Second, they omitted widely used measures that were not derived from empirical study (Reading Recovery levels, Fountas & Pinnell levels, etc.). Such levels are not necessarily wrong; remember, educators have intuitively identified text challenge levels for hundreds of years.
These intuitively derived schemes are especially interesting for the earliest reading levels (CCSS provides no guidance for K and 1). For the time being, it makes sense to continue to use such approaches for sorting out the difficulty of beginning reading texts, but then to switch to approaches that have been tested empirically in grades 2 through 12. [There is very interesting research underway on beginning reading texts involving Freddie Hiebert and the Lexile people. Perhaps in the not-too-distant future we will have stronger sources of information on beginning texts.]
Here is the new chart for identifying text difficulties for different grade levels:
Copyright © 2023 Shanahan on Literacy. All rights reserved.