Recently, I saw results of a meta-analysis that showed phonics instruction to have a much smaller effect size (.19) than many other approaches to reading instruction. Doesn’t that mean that we are overdoing phonics? If we want to improve reading comprehension it looks like it would make more sense to emphasize motivation, fluency, and inferencing than teaching phonics.
In 1986, Gough and Tunmer presented a model indicating that reading comprehension was the product of decoding ability (the ability to translate written or printed text into oral language – that is, the skills that would allow someone to read a text aloud) and language comprehension ability (listening comprehension which would allow an understanding of that oral rendition of text).
According to this so-called “simple view,” reading comprehension could be completely explained by those two sets of abilities – decoding and language comprehension.
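As a rough illustration of what "product" means in this model, the sketch below treats each ability as a score from 0 to 1, as in Gough and Tunmer's formulation. The function name and the reader scores are hypothetical, made up for illustration and not taken from any study.

```python
def simple_view(decoding: float, language_comprehension: float) -> float:
    """Reading comprehension as the product of two abilities, each scaled 0 to 1."""
    for name, value in (
        ("decoding", decoding),
        ("language_comprehension", language_comprehension),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return decoding * language_comprehension


# Hypothetical readers; these scores are invented for illustration only.
readers = {
    "strong decoder, weak language": (0.9, 0.3),
    "weak decoder, strong language": (0.3, 0.9),
    "no decoding at all": (0.0, 1.0),
}

for label, (d, lc) in readers.items():
    print(f"{label}: {simple_view(d, lc):.2f}")
```

Because the two abilities are multiplied rather than added, a score of zero on either one yields zero reading comprehension, no matter how strong the other is.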
Over time, data have accumulated supporting the key roles of both decoding and language in reading (Hoover & Tunmer, 2021; Sleeman, Everatt, Arrow, & Denston, 2022), and indicating diagnostic and pedagogical benefits to the scheme.
Nevertheless, the theory tends to break down around the edges.
Oral language and written language operate somewhat differently (Daniels & Bright, 1996) – complicating the idea that reading comprehension is no more than listening skills applied to text. There are vocabulary words that appear often in text, but rarely in oral language (e.g., occur, peruse, enumerate, venerate). Likewise, written sentence complexity often outstrips what we confront orally. Reading is like oral language, but mostly when we are being lectured – think about the sustained attention and memory demands of listening to an extended monologue. Oral language usually tends more to dialogue, reading to monologue. Also, oral language tends to allow for interaction between speaker and listener; not so much in reading (Olson, 1994). Treating oral language development as the sole basis of reading comprehension would fall woefully short.
The simple view is not especially specific about the skills included in either of those two constellations. Does phonemic awareness belong in decoding? What about the roles of reasoning and knowledge in listening comprehension? How do I know if I’m omitting a critical part of decoding or language?
Another complaint is that the model makes it look like decoding and language are equivalent – in what it takes to learn them, in their developmental horizons, and so on (Catts, 2018). Many children, perhaps most, can gain full benefits of decoding instruction during the first two or three years of school. Who would claim this to be true of language development?
Perhaps most damning is that statistical analyses of reading can't account for all the variation in reading ability with those two sets of variables alone (Wagner, Beal, Zirps, & Spencer, 2021). In fact, according to that rigorous analysis, the simple view only accounts for a bit more than half the reading variation – suggesting the need for additional variables or different ways of measuring the variables already identified.
In response to these limitations, Duke and Cartwright (2021) have advanced a more elaborate model of reading. Their Active View Model is more specific about what goes in those word reading and language comprehension bubbles. With their model you don’t need to guess about that. Enumerating those items complicates things a bit, and evidentiary support for individual items is pretty uneven. Some of the variables have a great deal of research support, others not so much (as of yet, anyway).
Duke and Cartwright also have included domains not part of the simple view. For instance, their model includes an Executive Function bubble that oversees word reading and comprehension. Another new category holds variables that don’t fit neatly into either word reading or language. For example, research has found that vocabulary plays important roles both in decoding and comprehension. Two-headed abilities like that populate a “bridge variables” constellation.
Just as one can marshal evidence that both decoding and language comprehension are important parts of reading – one can provide similar evidence for the active view variables.
The study that you noted (Burns, Duke, & Cartwright, 2023) was such an attempt.
These researchers examined relevant meta-analyses reported since 2006 – a rather arbitrary cut point (and one particularly unfortunate for the decoding variables). Doing it that way ensures that the largest body of research on elementary phonics instruction (the National Reading Panel Report) would be omitted from consideration.
If this study was aimed at understanding the impacts of phonics instruction, this approach would likely have been shot down by reviewers. A major concern with meta-analysis is sampling error. Ignoring a major corpus of data without persuasive theoretical and/or methodological reasons would be unacceptable.
However, their purpose was not to be comprehensive or even to suggest the relative importance of the variables in the model. They simply wanted to demonstrate that each of the constellations was supported by some empirical evidence. If all the phonics studies were included, the overall effect size might have been a bit bigger – it certainly wouldn’t have been lower. But the absence of those data wouldn’t alter the point that the major domains included in the active view are supported by evidence; that would also be true if the phonics effect size had turned out to be much larger.
This study accomplished its goals – it showed that the active view provides an efficient and coherent compendium of reading abilities (at least in terms of those major domains).
Perhaps this model will generate useful research or curriculum development going forward. But remember it’s just a model, and a partial one at that. This model is more complete than the simple view and does a better job of accommodating some of the knowledge about reading that has been developed over the past several decades. But it doesn’t suggest anything about how these variables fit together, how their relative importance changes with development, or many other issues relevant to reading instruction.
Other reasons not to be overly concerned about the relatively low phonics effect size in this study:
1. The study put forth two phonics effect sizes: the one you noted for average readers, and one for striving readers. That second effect size, the one for the strivers, was .48. That put phonics in the top tier of interventions for kids who struggle with reading. That effect size is based on 32 independent studies (the .19 was based on only 8), and remember, these effects were in terms of impact on reading comprehension or overall reading achievement – not on decoding.
2. The reporting of this study poses some important challenges to scholars, since it is difficult to identify which studies contributed to these main effects estimates. Usually in a meta-analysis, the studies are chosen because they provide data about the effect of a particular variable or approach. In this case, there are 12 variables for which main effects are reported based on data from 27 meta-analyses. But there is no linkage between the studies and the outcomes. That makes it almost impossible to evaluate the appropriateness of the analyses for any of the variables.
3. An example of the kind of further analysis that would be needed to evaluate a specific statistic like effect size for phonics instruction is posed by the Galuschka et al. (2014) meta-analysis. Galuschka combined the effects of studies that I would think of as learning trials, rather than efforts to raise general reading comprehension or overall reading achievement. Some of the phonics studies included in that meta considered instruction in which the “phonics” entailed no more than 4 half-hour lessons in which students memorized 25 two-letter syllables each day. I couldn’t figure out how any of the studies in that meta-analysis fit the purpose or selection standards of this Burns et al. study. My concerns about the inclusion of that odd meta-analysis don’t alter my overall estimate of the value of the Burns study, but they reveal why I wouldn’t be overly concerned about a specific effect size being higher or lower than you anticipated, given that it isn't clear which data contributed to it.
4. Another example of my concerns about the original meta-analyses that were the basis of this study is presented by the Suggate (2016) study. My concern about that one is that it focused on long-term benefits of skills (most of the other meta-analyses were more immediate in focus). Including long-term outcomes for some variables but not for others presents an unfortunate confound if the purpose was to compare variables – as it would suppress the relative impact of some variables. This is especially challenging given the odd classifications of original studies in the Suggate meta. For example, several studies conducted by Patricia Vadasy and her colleagues were classified as fluency interventions – not phonics, despite their focus on phonemic awareness, phonics, and code-oriented instruction (not fluency). This apparent misclassification may matter since these studies reported some of the biggest effect sizes in that analysis.
5. Another problem for the Burns et al. study is its failure to focus on interventions that addressed a single issue. Motivation, for example, was rarely if ever a variable on its own. A study included in the motivation set might have taught reading comprehension strategies along with some student choices for books, while the control groups received neither the strategy teaching, nor those books, nor the chance to make choices. Attributing outcomes from such studies to motivation alone is misleading.
6. The developmental nature of reading raises additional concerns. Decoding has been identified as a skill set with a relatively low ceiling. The importance or value of phonics instruction depends upon how well students can decode. Young children are likely to benefit more from phonics than older ones. Struggling readers will usually benefit more from such teaching than average readers, especially with older students. Just comparing effect sizes across very different interventions with very different samples of students cannot provide meaningful relative estimates of importance. (This is also true for vocabulary and fluency development – their value in supporting comprehension changes over time.)
7. Decoding is often described by scientists as a necessary but insufficient condition. That is, you can’t learn to read without learning to decode, but learning to decode will not be sufficient to make you a reader. This is like the food groups in nutrition. No nutritionist would ask, “Which food groups do we need to provide children?” They would recognize it as a trick question – to be healthy kids need all of these food groups, of course – it isn’t a competition between proteins and carbohydrates. In reading, making sure all kids reach threshold levels of decoding ability (Wang, Sabatini, O'Reilly, & Weeks, 2019) should be a non-negotiable -- no matter the relative effect sizes in this kind of rough analysis.
8. The simple view is unable to account for all the variance in reading ability, which makes the identification of a more complete model a worthwhile pursuit. The active view model seems to provide greater completeness. However, this first attempt to quantify the additional power that this model provides for explaining variation in reading attainment is not convincing. The new model with its new domains and its additional variables was only able to pick up an additional 2% of variance. This 2% was statistically significant, but I am dubious as to its educational importance. Given the problems with this analysis, I suspect the 2% added value is meaningless. That rather modest supposed added value wouldn't convince me to treat decoding or language differently than in the past.
Basically, this study has nothing to say about the relative value of phonics instruction (or of instruction of any of the dozen variables it included).
Burns, M. K., Duke, N. K., & Cartwright, K. B. (2023). Evaluating components of the active view of reading as intervention targets: Implications for social justice. School Psychology, 38(1), 30-41. doi:https://doi.org/10.1037/spq0000519
Catts, H. W. (2018). The Simple View of Reading: Advancements and False Impressions. Remedial and Special Education, 39(5), 317–323. https://doi.org/10.1177/0741932518767563
Daniels, P. T., & Bright, W. (Eds.). (1996). The world's writing systems. New York: Oxford University Press.
Duke, N. K., & Cartwright, K. B. (2021). The science of reading progresses: Communicating advances beyond the simple view of reading. Reading Research Quarterly, doi:https://doi.org/10.1002/rrq.411
Gough, P., & Tunmer, W. (1986). Decoding, reading, and reading disability. Remedial and Special Education, 7, 6–10.
Hoover, W. A., & Tunmer, W. E. (2021). The primacy of science in communicating advances in the science of reading. Reading Research Quarterly, doi:https://doi.org/10.1002/rrq.446
Olson, D. R. (1994). The world on paper: The conceptual and cognitive implications of writing and reading. Cambridge: Cambridge University Press.
Sleeman, M., Everatt, J., Arrow, A., & Denston, A. (2022). The identification and classification of struggling readers based on the simple view of reading. Dyslexia: An International Journal of Research and Practice, 28(3), 256-275. doi:https://doi.org/10.1002/dys.1719
Wagner, R. K., Beal, B., Zirps, F. A., & Spencer, M. (2021). A model-based meta-analytic examination of specific reading comprehension deficit: How prevalent is it and does the simple view of reading account for it? Annals of Dyslexia, 71(2), 260-281. doi:https://doi.org/10.1007/s11881-021-00232-2
Wang, Z., Sabatini, J., O'Reilly, T., & Weeks, J. (2019). Decoding and reading comprehension: A test of the decoding threshold hypothesis. Journal of Educational Psychology, 111(3), 387-401. doi:https://doi.org/10.1037/edu0000302
Background knowledge can be very important in reading comprehension. Students need to learn how to connect what they know to the information in a text. If students know nothing about a topic (which happens rarely), it can be useful to try to build up some relevant information prior to reading (not information that will be in the text, however). Students should be taught to use their knowledge to connect with the text (and they should learn to use other relevant information they might have). Although I might know nothing about bonobo monkeys, I might know something about other monkeys, or primates, or animals.
Thanks for this analysis. I had set out to examine the studies that were used to derive the effect sizes for several of the instructional areas / variables. As you note, the studies used in each area could not be determined.
That is never a problem with a first-order meta-analysis, at least as far as main effects are concerned. I don't think journals are doing a good job of setting reporting standards for meta-analyses -- if it is not possible to replicate the study, then it is poor reporting.
Thank you for this analysis. As you note, so importantly, even the expanded SVR models tell us little about how the variables fit together or how their relative importance changes over the course of development and instruction. And, as you also note, it is not a competition. These perspectives became even clearer when, after years of university teaching and research on the struggles of older children in both oral and written language, I worked with struggling 3rd graders face-to-face as a literacy volunteer. The students had some, but not great, phonics skills. They treated reading as an exercise in trying to read words correctly, not as something done for enjoyment or to learn something. Eventually I put together a routine that addressed motivation (through choice of reading material), fluency (some choral and repeated reading), sentence processing (marking periods in text and teaching syntactic chunking via prosody), and word study skills (common spelling patterns and affixes)...and figured out a way to integrate these variables as we read one text in a 45-minute session, repeating the same routine with a new text in each session. Comprehension was addressed by jointly composing a brief summary of the text, which they could see taking form on my computer screen. Two of the three children responded very well, essentially catching up with age/grade peers. In another year working with another student (also 3rd grade) but with lower phonics skills, I had to make some adjustments to this routine (more phonics emphasis), but still managed to include text-level applications in each session. Individualization in the context of an appreciation for the many variables of developmental reading is key. Again, as you noted, it definitely is not a competition.
I wonder if the issue at base is that we are relying on parametric statistics but the relationships among the variables are nonlinear?
Thank you, I've been waiting to see you write about this.
This is Ehri's latest finding; I hope it will show up this way.
Thank you so much for writing this piece. There's a lot of pushback because of the tag line SoR, which I'm sure we'll see eventually is doing a hell of a lot of good in Grade 1.
Dr. Ehri has encouraged starting at the larger unit and skipping syllables and onset rime, same as Brady.
Accurate pronunciation of the spelled word leads to permanence and automaticity, the bridge to reading comprehension. Not enough talk about that, again so well stressed in NRP.
The competing views lead to serious teacher confusion. I just hope it doesn't defeat us, again!
Is anyone differentiating for students? They all come with different skills, including our EL learners. Teach what they NEED. Stop wasting time, and our children’s education time. Assess, diagnose, and teach.
Professor Shanahan, thank you for this extensive and informed analysis. That “teacher question” is dang sophisticated. I like that a teacher is asking a question about effect sizes!
I thoroughly agree with your note that if meta-analysts set temporal limits for studies included in their corpus, they risk systematically excluding studies that may have contributed to a consensually established finding; if 20-30 studies clearly demonstrated the superiority of method R prior to Year N, and the meta-analyst cuts her or his criteria at Year N+1, all 20-30 of those studies are out.
Why not include everything in one’s corpus? Is the meta-analyst worried about the quality of older studies? Code for study quality. Is the meta-analyst worried about sample size? Code for that (using well-documented corrections, as available from texts by Cooper et al.). Is there worry about the outcome measures? It’s possible to code for that, too. All of those concerns can be empirical questions.
On the less technical side, the more philosophical side: To be sure, as much as I admire the “Simple View,” it doesn’t exhaust the contributors to the variance. Motivation (whatever advocates say that is), textual factors, and interactions between decoding and language comprehension (and other interactions) surely influence outcomes. I’m comfortable with asserting that the percentage of the variance controlled by the factors other than decoding and general language will be small.
So, no one should take that particular meta-analysis as suggesting educators should not focus on teaching decoding effectively.
John Wills Lloyd, Ph.D.
UVA Professor Emeritus
Founder & Editor, https://www.SpecialEducationToday.com/
Co-editor, Exceptional Children
I interviewed Dr. Linda Siegel last year; she mentioned her concern around ORF.
She told us that at the beginning, it was far more important to focus on accuracy while the child builds up his confidence with decoding.
If you can't answer that here, could you address it some other time?
The technology movement has sent a frenzy of testing into the schools and these nuances are often missing.
We deal with weak readers and catching up, so I truly believe in this circumstance she's 100 percent correct. But what about the other kids? We know fluency is really important, but what about fluency exercises in a vacuum where the orthographic instruction is missing?
Some teachers are developing some degree of sophistication about things like effect sizes -- and these data were made available on Twitter. However, we should not overestimate that sophistication. I only heard directly from 3 practitioners about this study -- one from the east coast, one the west, and one I don't know where he/she was from.
This is always a problem with translating research to practice in education. A meta-analysis comes out suggesting that phonics is foundational and especially impactful for at-risk students who are struggling to learn to read, and people use that analysis to dismiss phonics because of one relatively low outcome variable. Just because something has a lower effect size doesn't mean it isn't necessary -- for about 60% of students, we know they benefit from some direct instruction in phonics. AND we also know that OVER-teaching phonics has diminishing returns over time. Like any treatment, you want to give it to the people who need it most, in the dose that is most effective.
Thank you for the clarification. I think so often, we hear some findings from a meta-analysis and jump to conclusions about instruction based, perhaps, on our biases. The reality is that we don't look at the specifics of the studies, such as the variables at play, the sample size, and whether we're looking at correlations. Also, we fail to remember that even correlations do not mean causality. I appreciate your putting the findings of both studies in layman's terms.
Thank you, Dr. Shanahan~ I'm wondering about the factor of background knowledge and how much that plays a role in students' reading comprehension. My experience as a teacher was my 4th grade students could parrot the words but didn't have the schema to understand the meaning of the text.
Thank you for your analysis; the current examination of reading variables is missing many facets, significant among them the peculiar functions of knowledge of vowel sound/symbol correspondences and accurate perception.
In my master’s degree program, I studied the literature (the first science of reading, as it were) which resulted from the rigorous research in the science of language communication and in learning to read begun in 1965 under the auspices of the National Institute of Child Health and Human Development (NICHD) and the long-term, prospective, longitudinal, and multidisciplinary research, added in 1985 with the Health Research Extension Act. I was able to use what I learned in crafting effective instruction in my private practice with colleagues as well as for fellow teachers in the form of a manual and the website, TheSpellofLanguage.com.
I searched for evidence of difficulties posed by vowels when I realized that students who acquired vowel knowledge and perception assumed improved trajectories in language learning. In 1971, Donald Shankweiler and Isabelle Liberman studied linguistic aspects of error patterns in reading consonants and vowels. Vowels, unlike consonants, were heard correctly but frequently misread. One consideration at that time was that vowel errors occurred because of phonological confusions. Results of their study were recounted in "Misreading: A Search for Causes", a chapter in Language By Ear and By Eye, published in 1972.
When Shankweiler and Liberman discussed vowel errors again in May 1976, in an article entitled, “Speech, the Alphabet, and Teaching to Read," they made no further mention of perceptual difficulty and, instead, attributed vowel difficulty to complexity of orthography. Nevertheless, when Yolanda Post researched the topic in 1987, she found evidence of perceptual confusion. [et al. “Identification of Vowel Speech Sounds by Skilled and Less Skilled Readers and the Relation with Vowel Spelling.” Merrill-Palmer Quarterly 33 (1987)]. Her “finding” appears to have led nowhere.
Yet, for virtually all with whom my colleagues and I have worked, whether referred as young students with learning difficulties, as ELL students, or as older high-achieving students reading inefficiently, we have found that the gains derived from vowel mastery were comparable to those achieved in learning to apply the alphabetic principle. In a three-year project in a New Hampshire elementary school, we collected data showing that vowel sound-spelling correspondences are less easily learned and maintained than those of consonants. We also found that increases in vowel knowledge correlate with improved word learning in whole-class instruction across grade levels. I have been unable to interest researchers in investigating the peculiarities of vowel sounds that would account for what we observe.
The importance of other facets of written-language acquisition is also obscured by the continual reconstitution of the debate featuring the value of phonics. We have been able to demonstrate that, to quote Emily Hanford, we can “get the core instructional program right.”
Jean C. Tucker, M.Ed, CCC-SLP
In fact, the evidence for phonics over alternative methods is quite weak once you consider various methodological problems with various meta-analyses, including the NRP. In the case of the NRP, the meta-analysis includes many non-RCT studies, the quality of the phonics studies included in the NRP is often quite low, there is evidence for publication bias for the phonics RCT studies, and the meta-analysis does not provide any support for the conclusion that is commonly drawn from the NRP, namely that it provides evidence that SSP is better than whole language. In fact, only 4 studies compared SSP to whole language, with one positive result, one inhibitory result, and essentially 2 null results (very small positive results, with d of .12 and .07). There is also no assessment of long-term effects on spelling, reading texts, or reading comprehension. It is also the case that this meta-analysis is almost 25 years old, and many subsequent meta-analyses show weaker effects (and have many of the same flaws, including the wrong control condition if you want to claim that phonics is better than whole language -- and this is not an endorsement for whole language). I know I sound like a broken record, but I continue to be struck by the failure of researchers to address the problems I identified and published in one of the top journals in education.
Jeffrey S. Bowers (2020) Reconsidering the Evidence that Systematic Phonics is more Effective than Alternative Methods of Reading Instruction. Educational Psychology Review, 32, 681-705.
Jeffrey S. Bowers (2021). Yes children need to learn their GPCs but there really is little or no evidence that systematic or explicit phonics is effective: A response to Fletcher, Savage, and Sharon (2020). Educational Psychology Review. doi.org/10.1007/s10648-021-09602-z
Never mind that there is little or no evidence that almost 20 years of legally mandated phonics instruction in English schools has improved reading outcomes. It would be great if you would devote a blogpost to this work – what are the problems with my arguments?
For anyone interested in learning more go to my website where you can access all my published work: https://jeffbowers.blogs.bristol.ac.uk/publications/education-literacy/.
And for easier introduction to my work, check out my blogpost: https://jeffbowers.blogs.bristol.ac.uk/blog/
Leave me a comment and I would like to have a discussion with you!
What about the new research that says phonics instruction isn’t very important?
Copyright © 2023 Shanahan on Literacy. All rights reserved.