Do Screening and Monitoring Tests Really Help?

  • assessment
  • 05 March, 2022

Teacher Question:

I’m surprised that you don’t write about screening and monitoring tests. I’ve been a teacher for 24 years (first-grade) and I’m considering an early retirement. It seems like I’m supposed to test my students more than teach them. We just test and get ready for tests. I feel so sorry for the boys and girls. I want to teach reading, not FSF, ISF, PSF, NWF, WRF, ORF, LNF. Help me, Dr. Shanahan. Is this what’s best? Is this really a part of the “science of reading”?

Shanahan response:

I feel your pain. When I was a first-grade teacher, we didn’t have all those tests. It certainly made my life easier.

But did that lack make children’s education any better? That’s the real question.

Over the past couple decades such testing has insinuated itself into many primary grade classrooms, often because of policy mandates from above.

Twenty years ago, Reading First (a part of the “No Child Left Behind” federal legislation) was the main source of these testing mandates. These days it’s more likely to be the state Dyslexia Screening laws. Many school districts have taken up these tests on their own, as well.

No wonder. The scheme makes sense. There’s a logic to it.

We test kids at the beginning of the school year to find out which essential skills they have yet to develop. As the school year proceeds, we then re-evaluate to see how the kids are progressing. Teachers, based on the tests, are to differentiate and provide reteaching, and some kids may get extra help through instructional interventions beyond the classroom. The idea is commendable because it strives to keep students from falling behind.

What does research have to say about this approach?

First, there are the studies of the tests themselves. These days, there is a slew of such measures that probe into proficiency with letters, phonemic awareness, decoding, oral reading fluency, and spelling (curriculum-based measures) or that try to predict later reading performance. Studies show that many of these short tests are both valid and reliable. Not that research hasn’t identified important limitations, too – including their shortfall with some populations, such as English Learners (Newell, Codding, & Fortune, 2020); reliability problems when administered by teachers under normal classroom conditions (Ardoin & Christ, 2008); and the so-called false positives issue which often leads to the overidentification of reading problems – meaning some kids get extra instruction when they don’t need it (Compton, Fuchs, Fuchs, Bouton, Gilbert, Barquero, Cho, & Crouch, 2010).

That means such tests can do a good job, but that they sometimes don’t. Still, even with these snags, it appears that they are up to the job for which they are intended (January & Klingbeil, 2020; Petscher, Fien, Stanley, Gearin, Gaab, Fletcher, & Johnson, 2019).

So far, so good.

Second, there are studies of various early interventions. Again, there has been substantial study into whether remedial interventions help kids to progress. Several programs that deliver targeted instruction to low readers have been found to be successful.

That’s even better.

Given that there are valid tests and effective interventions out there, you’d think there would be strong evidence supporting programs of early identification and differentiation of instruction.  

That’s where things get complicated.

Because, in fact, the evidence supporting the use of such testing to improve reading achievement is neither strong nor straightforward. The pieces are there, but the connections are a bit shaky.

You don’t have to take my word for it.

The What Works Clearinghouse (WWC) issued a relevant practice guide, Assisting Students Struggling with Reading: Response to Intervention (RtI) and Multi-Tier Intervention in the Primary Grades (Gersten, Compton, Connor, Dimino, Santoro, Linan-Thompson, & Tilly, 2008).

That guide recommended that students be screened and monitored in reading. The WWC (part of the research arm of the U.S. Department of Education) evaluated that recommendation and concluded that it was supported by moderate research evidence. Studies showed that such testing could be implemented successfully.

The panel also recommended that, based upon the data from these tests, students should be provided with differentiated reading instruction. The WWC concluded that there was minimal research evidence supporting this recommendation (at that time, they could only cite a single correlational study that suggested the possibility of effectiveness). In other words, there wasn’t convincing proof that teaching in response to testing improved student learning.

Don’t bail yet – that was almost 15 years ago, and things do change.

Given that, I started looking for more recent evidence. In some states, these policies have been in place for quite a while, so maybe public data could provide some clues. Also, what about new studies over the past decade?

Unfortunately, public data hasn’t been especially informative. Since 2006, National Assessment (NAEP) scores have languished (and even fallen a bit recently). But I could find no analyses that linked implementation of these testing policies to reading performance in the various states. Likewise, as far as I could determine, no state has even bothered to monitor whether these laws are helping kids to learn better. (When I’m contacted by groups wanting my help in getting their states to adopt early reading assessment policies, I always ask how it has gone in states that have those policies already. I have yet to find someone who had any idea.)

The largest study of Response to Intervention (RtI) or Multi-Tiered Response efforts – early assessment and intervention is a big part of those – wasn’t encouraging either (Balu et al., 2015). That national study compared learning results of schools that had such programs with those that didn’t. Startlingly, first graders did worse in the test-and-differentiate schools than in the business-as-usual schools. You can read too much into that result given some gaps in the study. Nevertheless, those results aren’t exactly a glowing endorsement of the instructional practices that you’re finding oppressive.

I polled some colleagues who are big fans of these screening/monitoring assessments. They steered me to the studies they cite in their presentations and publications. I looked. There were some terrific studies that provided strong supporting evidence for the test-and-differentiate idea (Carlson, Borman, & Robinson, 2011; van Geel, Keuning, Visscher, & Fox, 2016; Stecker & Fuchs, 2000), but they weren’t reading studies. The best evidence on this approach comes from math, a different thing altogether. The Carlson study considered both reading and math, but only reported significant positive results on the math side. Oops.

That’s frustrating.

There has been some recent academic research that has been more supportive, however.

For instance, one study found that assessment-based differentiated reading instruction in Grade 3 had a positive impact on fluency, but not on reading comprehension (Forster, Kawohl, & Souvignier, 2018). The fluency gains were stable over two years. The lowest readers gained the major benefits of the practice, and teachers needed significant support to make it work (including special instructional materials). The researchers concluded that providing test data to teachers alone was not an effective approach, and they reinforced this claim with conclusions drawn by other researchers from other studies. For instance, Lynn Fuchs and Sharon Vaughn – big supporters of the early assessment approach – concluded that “differentiated instruction is beyond the skill set of even the most proficient teachers” (2012, p. 198). So, at least some positive results.

More persuasive evidence was provided by several studies reported by Connor and her colleagues (Connor, Morrison, Fishman, Crowe, Al Otaiba, & Schatschneider, 2013; Connor, Morrison, Fishman, Giuliani, Luck, Underwood, et al., 2011; Connor, Phillips, Young-Suk, Lonigan, Kaschak, Crowe, Dombek, & Al Otaiba, 2018; Connor, Piasta, Fishman, Glasney, Schatschneider, Crowe, et al., 2009). They found that they could successfully raise first and second grade reading achievement through assess-and-differentiate efforts: identifying who needed more decoding tuition and then keeping those kids under close teacher supervision so that they would progress in phonics (while providing the more advanced students with independent reading work and experience).

As powerful and persuasive as the Connor data are – and they are persuasive to me – it is important to note that this team did much more than turn test data over to teachers and hope for the best. No, they developed a proprietary algorithm that they use to determine the appropriate data-based response to student needs. “Taking these results together indicates that predicting appropriate amounts and types of instruction is not as straightforward as has been previously suggested” (Connor, et al., 2009, p. 93). In fact, they concluded that without their algorithmically based approach some students would likely receive too much decoding instruction, while others would certainly receive too little. That means that early testing can have positive learning outcomes, but only if the results of those tests are weighed appropriately, not something easy for individual teachers to do.

Finally, there is a recent meta-analysis of 15 studies of reading interventions with a “data-based decision making” component and their effects on struggling readers in grades K-12 (Filderman, Toste, Didion, Peng, & Clemens, 2018). The effect sizes for these interventions were small but significant. Six of the studies allowed for comparisons of interventions with and without data-based decision making (again, with small positive effects for using the assessments as the basis of teaching).

My conclusion, from all this evidence, is that it is possible to make the kind of assessment that you are complaining about effective. However, it should also be evident that such efforts too often fail to deliver on those promises.

One of the problems is that there is simply too much testing – especially for the students who aren’t low achieving in reading (VanDerHeyden, Burns, & Bonifay, 2018). That WWC Practice Guide referred to earlier called for three testings per year, but in many jurisdictions, kids are getting far more than that, and the frequency of testing is not necessarily linked to any need for information. If you know a youngster is struggling with phonemic awareness, why not just teach more of that rather than testing the student over and over?

Another problem is that teachers can find it challenging to administer so many tests under classroom conditions. Not only does it undercut the amount of instruction, but it can be tough to provide a valid assessment of phonemic awareness or oral reading fluency when students or teachers struggle to hear each other. In my experience, the best data is produced when test administrators are brought in to take on this burden. I know I trust such data more (and so does the Institute of Education Sciences in the research studies that they support).

Finally, translating test data into properly and productively differentiated instruction is not the no-brainer that policymakers and school administrators seem to presume. They budget for the tests, and then provide little or no professional development, guidance, or material supports to make these efforts effective (and Dyslexia Screening laws don’t address what it takes to make these laws work either).

My opinion? Your school is trying to go in the right direction. Help them. Screening and monitoring kids’ early literacy skills can be worthwhile.

The amount of screening and monitoring testing needs to be strictly limited, however.

In many schools/districts/states, we are overdoing it! The only reason to test someone is to find out something that you don’t know. If you know students are struggling with decoding, testing them to prove it doesn’t add much.

The point of all this testing is to reshape your teaching to ensure that kids learn. Unfortunately, these heavy investments in assessment aren’t always (or even usually) accompanied by similar exertions in the differentiation arena.

Talk to your principal, or your district’s curriculum or special education administrators. Request professional development – with classroom demonstrations, in-class coaching, and joint planning – to help get your head back in the game. The reason you became a teacher, I bet, was that you wanted to help kids. This testing could be part of that, but you can’t do that without support. I bet your colleagues would benefit from that, too. 


Ardoin, S.P., & Christ, T.J. (2008). Evaluating curriculum-based measurement slope estimates using data from triannual universal screenings. School Psychology Review, 37(1), 109-125.

Balu, R., Pei, Z., Doolittle, F., Schiller, E., Jenkins, J., & Gersten, R. (2015). Evaluation of Response to Intervention practices for elementary school reading (NCEE 2016-4000). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.

Carlson, D., Borman, G. D., & Robinson, M. (2011). A multistate district-level cluster randomized trial of the impact of data-driven reform on reading and mathematics achievement. Educational Evaluation and Policy Analysis, 33(3), 378–398. doi:10.3102/0162373711412765

Compton, D.K., Fuchs, D., Fuchs, L.S., Bouton, B., Gilbert, J.K., Barquero, L.A., Cho, E., & Crouch, R.C. (2010). Selecting at-risk first-grade readers for early intervention: Eliminating false positives and exploring the promise of a two-stage gated screening process. Journal of Educational Psychology, 102(2), 327-340.

Connor, C. M., Morrison, F. J., Fishman, B., Crowe, E. C., Al Otaiba, S., & Schatschneider, C. (2013). A longitudinal cluster-randomized controlled study on the accumulating effects of individualized literacy instruction on students' reading from first through third grade. Psychological Science, 24, 1408–1419.

Connor, C. M., Morrison, F. J., Fishman, B., Giuliani, S., Luck, M., Underwood, P. S., et al. (2011). Testing the impact of child characteristics X instruction interactions on third graders' reading comprehension by differentiating literacy instruction. Reading Research Quarterly, 46, 189–221.

Connor, C.M., Phillips, B.M., Young-Suk, G.K., Lonigan, C.J., Kaschak, M.P., Crowe, E., Dombek, J., & Al Otaiba, S. (2018). Examining the efficacy of targeted component interventions on language and literacy for third and fourth graders who are at risk of comprehension difficulties. Scientific Studies of Reading, 22(6), 462-484.

Connor, C. M., Piasta, S. B., Fishman, B., Glasney, S., Schatschneider, C., Crowe, E., et al. (2009). Individualizing student instruction precisely: Effects of child X instruction interactions on first graders' literacy development. Child Development, 80, 77–100.

Filderman, M. J., Toste, J. R., Didion, L. A., Peng, P., & Clemens, N. H. (2018). Data-based decision making in reading interventions: A synthesis and meta-analysis of the effects for struggling readers. Journal of Special Education, 52(3), 174–187.

Forster, N., Kawohl, E., & Souvignier, E. (2018). Short- and long-term effects of assessment-based differentiated reading instruction in general education on reading fluency and reading comprehension. Learning and Instruction, 56, 98-109.

Fuchs, L. S., & Vaughn, S. (2012). Responsiveness-to-Intervention: A decade later. Journal of Learning Disabilities, 45(3), 195–203.

Gersten, R., Compton, D., Connor, C.M., Dimino, J., Santoro, L., Linan-Thompson, S., & Tilly, W.D. (2008). Assisting students struggling with reading: Response to Intervention and multi-tier intervention for reading in the primary grades. A practice guide. (NCEE 2009-4045). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. Retrieved from

January, S.A., & Klingbeil, D.A. (2020). Universal screening in grade K-2: A systematic review and meta-analysis of early reading curriculum-based measures. Journal of School Psychology, 82, 103-122.

Newell, K.W., Codding, R.S., & Fortune, T.W. (2020). Oral reading fluency as a screening tool with English learners: A systematic review. Psychology in the Schools, 57, 1208-1239.

Petscher, Y., Fien, H., Stanley, C., Gearin, B., Gaab, N., Fletcher, J.M., & Johnson, E. (2019). Screening for dyslexia. Washington, DC: U.S. Department of Education, Office of Elementary and Secondary Education, Office of Special Education Programs, National Center on Improving Literacy. Retrieved from

Stecker, P. M., & Fuchs, L. S. (2000). Effecting superior achievement using curriculum-based measurement: The importance of individual progress monitoring. Learning Disabilities Research and Practice, 15, 128–134.

VanDerHeyden, A.M., Burns, M.K., & Bonifay, W. (2018). Is more screening better? The relationship between frequent screening, accurate decisions, and reading proficiency. School Psychology Review, 47(1), 62-82.


See what others have to say about this topic.

Jacquelyn Vegh Mar 05, 2022 05:56 PM

This is powerful information, but we are at the mercy of the demands made by our federal and state government. Our legislators will continue to require these tests as long as there are lobbyists from test companies with deep pockets. As long as we have mandated tests, schools will give a million other tests to prepare for the tests that count.

Lindsey Mar 05, 2022 06:07 PM

What are your thoughts on MAPS testing? I am in Texas. It seems to be taking over in our district and held up on a pedestal and teachers are being forced to use solely this info to drive their instruction. Thank you.

Angie Neal Mar 05, 2022 06:29 PM

Great post! I think there needs to be a definitive purpose to the screening/testing. Testing to determine reading level for leveled books - waste of time and not evidence-based. Testing to determine specific weaknesses when a student is struggling and then using that information to inform instruction- evidence-based and beneficial.

Sam Bommarito Mar 05, 2022 07:18 PM

Thanks for this timely information!

Margaret Mar 05, 2022 07:20 PM

I am a reading specialist in a small rural school. I’m often (not always) the only person that gives our Benchmark assessments. It takes a significant amount of time and then I work hard to see student weaknesses so that I may pull them into small group instruction. I have found that by doing these assessments, I am able to find weaknesses that the classroom teachers are not able to identify due to the number of students in the room. Sometimes the teacher has a “feeling” about student weaknesses and I am able to drill down and other times they are surprised by the results because the student was an excellent compensator or was a quiet rule follower who was able to skate by. Either way, I like to keep the groups semi-flexible and if a student seems to be responding to the intervention, I reassess and take them out of the group. Sometimes they end up coming back but not too often. So, though I agree that I spend a lot of time testing that I wish could be used on instruction, I feel that I’m working with the students that truly show weaknesses and it truly drives my intervention instruction.

Thank you for all that you do, Dr. Shanahan! I’m always learning more to improve instruction for our kids!

Linda Fenner Mar 05, 2022 07:44 PM

When I looked about a year ago for independent research (not studies sponsored by NWEA) on whether the MAP was correlated with increased student learning, I could find almost nothing. MAP dominates in our area and is used by some districts in teacher evaluations to demonstrate "growth." Dyslexia legislation has resulted in many districts adopting a canned program to meet the legislative requirement. During the pandemic, my grandson did not learn to read in first grade. The district (which is in another state) placed him with a highly skilled reading specialist who taught him for 30 minutes a day using a differentiated, personalized approach. This tutoring lasted for about a semester. She provided this instruction remotely, so I had the opportunity to watch my grandson's progress. He is in third grade reading quite well now.

Matt Mar 05, 2022 09:45 PM

Hi Tim,

Are you familiar with Sharon Walpole's work on differentiated instruction? I believe it attempts to diagnose specific unmastered foundational skills, teach those skills to the kids that require it, and then progress monitor. I'd be interested to hear your thoughts on her system of differentiation if you are at all familiar with her work.

Cheri Mar 05, 2022 10:35 PM

As an Instructional Coach in Elementary ELAR, I see a huge gap between testing and using the data to truly drive instruction. We really need to be doing our formative checks and planning from there, and leave the ginormous, soul-crushing, not-really-real-world summatives for the end of the year…or better yet, at the beginning, so we can use where the kids are then, instead of after summer slide!

Sebastian Wren Mar 05, 2022 11:27 PM

In our high-dosage tutoring program, we test all of the students once per week with a targeted, timed test. It takes one minute, and we only test the domain of reading that we are focused on with that student's instruction -- Letter-sound knowledge, basic decoding skills, reading fluency. Each student gets a test in the one aspect of reading we are targeting with instruction until they are doing well in that domain. Then we change instruction to a new domain and change our weekly assessment to match instruction. Without assessment, we could not target each student's needs. But we make sure that the assessment is quick and efficient.

I will never understand why so many schools give so many different, often redundant or inappropriate tests. A few quick tests will tell you most of what you need to know. And I will certainly never understand why we are starting to use computers to administer these tests. I've compared ISIP with our assessment data, and the computer is fairly reliable and valid with high-performing students, but it is very unreliable with students who are not yet good readers -- but of course those are the students we need the best information about.

Dr. William Conrad Mar 06, 2022 08:17 AM

Pandering to an assessment illiterate 1st grade teacher is not the answer, Tim. The teacher had no understanding of the assessment acronyms she was spouting.

Assessment is a fundamental pillar of teaching and learning. Over 1/2 the colleges of education still fail to teach the science of reading. It is no wonder that the 80% of white women teachers have difficulty with reading screening and monitoring assessments.

There are two essential elements to any assessment: the collection of assessment data and the accurate evaluation of the data. Clearly, our extraordinarily weak teaching pool has issues with the latter assessment function.

Let’s promote more assessment literacy and less platitudes and pandering.

Timothy Shanahan Mar 06, 2022 02:25 PM


I think you might want to re-read my entry. The research shows how rarely teachers are prepared to use that test data -- but that doesn't stop testing mandates without appropriate supports or purposes. Thousands of schools have been mandated to provide such testing at a frequency that isn't justified by psychometric evidence and we still aren't seeing any achievement benefit. That's a tragedy. That first grade teacher is not the one who is deficient.


Timothy Shanahan Mar 06, 2022 07:48 PM


There are different sets of MAP tests. Some are more appropriate than others in my opinion. But even the most appropriate ones (the ones that don't try to give highly specific prescriptions for learning -- like focusing on single objectives on your state standards) can be given too often and can be used without sufficient support for teachers. The problem isn't the tests as much as the ways in which we use them. I've seen districts way over use MAP, to the point that I have advised they drop them altogether since they weren't providing the necessary supports to make this testing worthwhile.


Timothy Shanahan Mar 06, 2022 07:53 PM


I think highly of Sharon Walpole and her work. She is a dear friend and stays tightly aligned to research.


Curious Mar 07, 2022 02:14 AM

There's two pieces that I'm wondering more about here that I would really appreciate some feedback on.

I'll focus on one item here: the conversation around screening and progress monitoring sometimes feels overly-focused or constrained by a focus on specific probes like CBM.

I can appreciate the value here of a range of 1 min checks at determined times but also recognize that these serve as proxies that come with their own limits (sensitivity, specificity, false pos and neg, etc) and conditions (e.g.: repeated administrations to increase confidence). If I was not able to spend much time in a classroom, I'd use these for sure. (e.g.: a researcher).

But, if I am able to spend a year with students as a teacher, then why not just deliberately inventory the specified foundational skills that are deemed important here over a reasonable period (for screening)? And then report out on these at specific intervals (for progress monitoring).

Am I missing something here?

(Outside of CBM probes sometimes being required as a matter of policy or procedure in some spaces - and therefore not negotiable).

Kim Mar 06, 2022 10:24 PM

Hi Tim,
Great post. I am a K-3 special educator in a suburban targeted Title I school. We are currently in the latter stages of a curriculum review that has been narrowed down to two options. Unfortunately, neither option is aligned closely enough with the necessary components of SOR, which is already sparking conversation about needing a supplementary phonemic program to improve what is currently lacking in our Tier 1 instruction. The teachers are not trained in those key components. My hope is that if a supplementary phonemic program is chosen, it will start to build those skills for our teachers that they were not trained in or provided PD for SOR components.
In specific response to your post, an injustice that I see occurring in our district is that, as a result of the screenings, which occur three times a year, students that fall below benchmark are placed in "intervention" groups which may be led by a Title I assistant or instructional assistant that is not trained, and the students that are on or above benchmark end up remaining in groups with the teacher that at least has been through a teacher training program. It doesn't make any sense. So when I read in your post that some schools did not make gains when using this approach it doesn't surprise me. I continue to advocate for placing only our most struggling students with reading specialists for intervention groups and allowing the assistants to only push in to support work in the whole group setting. This is a flawed system and I know this happens across many schools/districts, not just my district. This only ends up causing more referrals for evaluations when students haven't truly had effective interventions. Special education numbers just continue to increase. Thanks for all your great insight and work.

Timothy Shanahan Mar 07, 2022 02:04 PM

When one considers the number of skills (such as the number of spelling patterns) and the number of items one would need to reliably test student proficiency with each of those skills, it would take an inordinate amount of time to provide such testing. If you want to test how well kids do with beginning consonants, you would need several items to accomplish that. But if you wanted to test how well kids do with the initial P /p/, you'd need a similar number of items... and so on. Sampling is the way to go.


Heather Mar 07, 2022 05:22 PM

The fact that there are just too many informal reading assessments is a problem. Even if teachers had appropriate training in one assessment while in a teacher prep program... there is a very good chance that the school district they work at uses a totally different informal reading assessment. Not just that, but the data gleaned from the assessments is not used appropriately, and it is detrimental because it is how they are deciding instruction for struggling students. I think reading assessments have a place, but there are too many. I also think you are absolutely correct when it comes to needing to test in a quiet place for the data to be accurate and reliable. It has nothing to do with just white women having difficulty with reading screening and monitoring. Too many people think they know, but they don't. Assessment tools could be better, and more streamlined.

Dr. Sarah Siegal Mar 09, 2022 04:14 PM

What a great post! I was really excited to see Dr. Connor's work highlighted here. I worked with Dr. Connor as a graduate student and post-doc at FSU, ASU, and UCI. I am now focused on helping translate her persuasive research into classrooms with Learning Ovations and can say that we've been having similar assessment conversations with some of our partner schools and teachers, so it's really helpful to have the discussion captured so clearly here.

I'd also like to encourage anyone who's interested in learning more about how Dr. Connor's work has informed how teachers are thinking about their instruction to check out or get in touch with us! Her work makes these testing challenges easy, and is allowing thousands of teachers to effectively use assessment to guide instruction and drive better outcomes!

Curious Mar 11, 2022 06:46 AM

Thanks for your response Tim.

I'm just trying to think through a bit of a decision tree so that assessments are both efficient and informative.

For students in about grade 2 and above, I'm intrigued by the idea of using the Words Their Way spelling inventory PLUS an ORF measure (CBM). The reason I wonder about the WTW spelling inventory here is partly because it's both informative and it can be administered to a class as a whole.


Or are there alternative measures you'd suggest in terms of what to start with?

Timothy Shanahan Mar 11, 2022 02:19 PM


Yes, I find that inventory useful.

