Should we grade students on the individual reading standards?

  • Common Core State Standards testing
  • February 10, 2024
  • 32 Comments

 

Blast from the Past: This entry first posted on September 7, 2019, and reappeared on February 10, 2024. It seems to be that time of year again. Principals are being encouraged by central administrations to put on the full-court press for higher test scores this year. I know that not because of Mardi Gras or Groundhog Day, but because I'm starting to get those questions from teachers. Here is one that just came in: "Dr. Shanahan, you have stated that standardized reading test item analysis is a 'fool's errand.' My district requires me to complete an item analysis of a standardized reading test to 'narrow my instructional focus on specific skills and question types,' per my admin. You stated, 'It is not the question type. It is the text.' How can I convince my supervisors to move away from this practice? Please help." Almost five years separates these questions, and the answer to them has not changed a bit.

RELATED: I want my students to comprehend, am I teaching the wrong kind of strategies?

Teacher question:

What are your thoughts on standards-based grading in ELA, which is used in many districts? For example, teachers may be required to assign a number from 1-4 (4 being mastery) that indicates a student’s proficiency level on each ELA standard. Teachers need to provide evidence to document how they determined the level of mastery. Oftentimes tests are created with items that address particular standards. If students get those items correct, that is taken as evidence of mastery. What do you recommend?

Shanahan response: 

Oh boy… this answer is going to make me popular with your district administration!

The honest answer is that this kind of standards-based grading makes no sense at all.

It is simply impossible to reliably or meaningfully measure performance on the individual reading standards. Consequently, I would not encourage teachers to try to do that.

If you doubt me on this, contact your state department of education and ask them why the state reading test doesn’t provide such information.

Or better yet, see if you can get those administrators who are requiring this kind of testing and grading to make the call.

You (or they) will find out that there is a good reason for that omission, and it isn’t that the state education officers never thought of it themselves.

Better yet, check with the agencies that designed the tests for your state. Call AIR, Educational Testing Service, or ACT, or the folks who designed PARCC and SBAC, or any of the other alphabet soup of accountability-monitoring devices.

What you’ll find out is that no one has been able to come up with a valid or reliable way of providing scores for individual reading comprehension “skills” or standards.

Those companies hired the best psychometricians in the world, have collectively spent billions of dollars designing tests, and still haven't been able to do what your administration wants. And, if those guys can't, why would you assume that Mrs. Smith in second grade can do it in her spare time?

Studies have repeatedly shown that standardized reading comprehension tests measure a single factor, not a list of skills represented by the various types of questions asked.
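For the statistically inclined, here is a minimal sketch of why that single-factor finding keeps turning up. It is a toy simulation under assumed numbers (the sample sizes, the discrimination value, and the eigenvalue check are my illustrative choices, not any test publisher's method): if one latent reading ability drives every item, the inter-item correlation matrix has a single dominant eigenvalue, no matter how the items are labeled.

```python
# Illustrative simulation only -- made-up numbers, not any testmaker's data.
# One latent reading ability drives every item; the inter-item correlation
# matrix then shows a single dominant factor, regardless of how the items
# would be labeled ("main idea," "inference," "key details," ...).
import numpy as np

rng = np.random.default_rng(42)
n_students, n_items = 2000, 24

ability = rng.normal(size=n_students)      # one latent trait per student
difficulty = rng.normal(size=n_items)      # one difficulty per item
# Chance of answering correctly rises with ability, falls with difficulty.
logits = 2.0 * ability[:, None] - difficulty[None, :]
responses = (rng.random((n_students, n_items)) < 1 / (1 + np.exp(-logits)))

corr = np.corrcoef(responses.astype(float), rowvar=False)   # 24 x 24
eigenvalues = np.linalg.eigvalsh(corr)[::-1]                # descending
print("Top five eigenvalues:", np.round(eigenvalues[:5], 2))
# The first eigenvalue is several times larger than all the rest, so
# question-type subscores would just be noisier copies of one score.
```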

What should you do instead?

Test kids’ ability to comprehend texts at a target readability level. For instance, in third grade you might test kids with passages at levels appropriate for each report-card marking period (475L, 600L, 725L, and 850L).

You can still ask questions about these passages based on the “skills” that seem to be represented in your standards—you just can’t score them that way.

What you want to know is whether kids can make sense of such texts, through silent reading, with 75% comprehension.

In other words, it’s the passages and text levels that should be your focus, not the question types or individual standards.
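For anyone who has to operationalize that, here is a minimal sketch of the bookkeeping under the scheme above (the function name, the sample data, and the exact 75% cutoff are illustrative assumptions, not a prescribed tool): tally each student's answers by passage level and ignore the question-type labels entirely.

```python
# A hypothetical gradebook sketch: score by passage level, not question type.
from collections import defaultdict

def comprehension_by_level(item_results, threshold=0.75):
    """item_results: (lexile_level, answered_correctly) pairs for one student."""
    asked = defaultdict(int)
    right = defaultdict(int)
    for level, correct in item_results:
        asked[level] += 1
        right[level] += int(correct)
    return {level: (right[level] / asked[level],
                    right[level] / asked[level] >= threshold)
            for level in asked}

# Eight items across two third-grade marking levels (made-up data).
student = [(475, True), (475, True), (475, True), (475, True),
           (600, True), (600, True), (600, False), (600, False)]
for level, (pct, ok) in sorted(comprehension_by_level(student).items()):
    print(f"{level}L: {pct:.0%} -> {'adequate' if ok else 'needs support'}")
# Prints: 475L: 100% -> adequate, then 600L: 50% -> needs support.
```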

If kids can read such passages successfully, they’ll be able to answer your questions. And, if they can’t, then you need to focus on increasing their ability to read such texts. That means teaching things like vocabulary, text structure, and cohesion and having the kids read sufficiently challenging texts—not practicing answering certain types of questions.

Sorry, administrators, but you're sending teachers on a fool's errand, one that will not lead to higher reading achievement. It will just produce misleading information for parents and kids and waste your teachers' effort.

References

ACT. (2006). Reading between the lines. Iowa City, IA: American College Testing.

Davis, F. B. (1944). Fundamental factors in comprehension in reading. Psychometrika, 9(3), 185–197.

Kulesz, P. A., Francis, D. J., Barnes, M. A., & Fletcher, J. M. (2016). The influence of properties of the test and their interactions with reader characteristics on reading comprehension: An explanatory item response study. Journal of Educational Psychology, 108(8), 1078–1097. https://doi.org/10.1037/edu0000126

Muijselaar, M. M. L., Swart, N. M., Steenbeek-Planting, E., Droop, M., Verhoeven, L., & de Jong, P. F. (2017). The dimensions of reading comprehension in Dutch children: Is differentiation by text and question type necessary? Journal of Educational Psychology, 109(1), 70–83. https://doi.org/10.1037/edu0000120

Spearritt, D. (1972). Identification of subskills of reading comprehension by maximum likelihood factor analysis. Reading Research Quarterly, 8(1), 92–111. https://doi.org/10.2307/746983

Thorndike, R. (1973). Reading as reasoning. Reading Research Quarterly, 9(2), 135–147. https://doi.org/10.2307/747131

 

To examine the comments and discussion that responded to the original posting, click here:

https://www.shanahanonliteracy.com/blog/should-we-grade-students-on-the-individual-reading-standards

 

LISTEN TO MORE: Shanahan On Literacy Podcast

Comments

See what others have to say about this topic.

Dr. Bill Conrad Feb 10, 2024 02:58 PM

Great advice, Tim! As usual!

Assessment illiteracy abounds in education!

If administrators were master teachers, we would no longer have the problem of administrators giving very bad advice designed primarily to provide inappropriate shortcuts to improve student performance on standardized tests. Administrators perceive that these inappropriate shortcuts will improve test scores and protect their inflated salaries! They won't, and they do a disservice to children and families.

There is way too much self over service and loyalty over competence in education.

Read The Fog of Education!

Dr. Evangeline Aguirre Feb 10, 2024 03:06 PM

Grading students based on performance on standardized assessments is not a guarantee of fair and accurate evaluation of their literacy/academic performance. As a former ESL/Reading teacher, I had firsthand experience of how this malpractice continues to negatively affect students' literacy skills and, sadly, erode their enthusiasm for reading. Overall comprehension is what we want our students to be good at. Unfortunately, the “chaotic test prep season” puts stress on both students and teachers trying to maximize review of test strategies. “What are the key words? Did you skim and scan the text?” among many other staple review questions.

I would like to grade my students’ abilities to read based on the depth of their comprehension as demonstrated in oral discussions, writing assessments, literacy reflections and project-based learning.

One needs to take a closer look at the culture standardized assessments have built. It's a competition over who gets more funding based on test scores. Who gets to be an “A” school?

How does the system of standardized assessments help ELLs? Even newcomers take the assessments, with no knowledge of English at all. How many times have I witnessed these students sit through hours of assessments they knew full well they could make nothing of? And YES, I raised questions about this on many occasions. How about the culturally bound concepts they're not familiar with? I remember when my newcomers sat for a good 6 hours of PSAT testing. They came back telling me they slept the whole time. We could have read some great articles, engaged in relevant projects, and made many other more productive uses of those 6 hours.

While a lot of educators go through the pressure of test prep and grading, I was privileged enough to work with administrators who allowed me to modify the grading procedure & develop assessments based on the unique and special needs of my students. The school got the grade for standardized assessment results.

This is a serious conversation that needs dire attention.

Jose Feb 10, 2024 03:38 PM

Tim, I’m a bit confused about your comment that states don’t provide the standards-level information because our state (MA) does tag each test question by standard. (Whether that’s a good idea or not, that’s another story.) To clarify, you’re saying they may provide it, but it’s not reliable? Or that they don’t provide it at all?

I know you’ve long argued against trying to disentangle the reading standards. But are you similarly arguing against standards-based grading, particularly at the secondary level? Maybe not necessarily one grade/rating of 1-4 (Not Meeting, Partially Meeting, Meeting, Exceeding) per standard, but clumping clusters of reading standards (Key Ideas and Details, Craft and Structure, Integration of Knowledge and Ideas) and writing standards, or even the reading and writing standards as a whole? I know that such an approach would be flawed, but I don’t see how it is more flawed than the typical 100-point grading scale that tries to pack all sorts of information into one number and therefore communicates nothing.

To put all this another way, if we shouldn’t use the standards as a basis for grading, what’s the “right” way to use them to inform our curriculum, assessment, and instruction?

To be clear, I’m not disagreeing with you or saying what many districts are doing isn’t misguided. I’m just trying to disentangle the issue of hyper-focusing on question types from the issue of standards-based vs traditional grading.

Timothy Shanahan Feb 10, 2024 03:48 PM

Jose--
That is stupid, but they don't provide scores on those... they don't tell you that a student or your school failed to succeed on standard #... What they are telling you is they ask questions based on the standards -- but they don't tell you how anybody does on any individual standard.

tim

Lauren Feb 10, 2024 04:22 PM

I won't name any names of tests... but our district has online screening/progress-monitoring assessments administered three times per year. One is for specific phonics skills and some reading fluency and accuracy, and one is for comprehension, IRL, Lexile numbers, etc. There is talk of adding another one, which claims to separate out individual reading standards to a greater degree. The most enjoyable thing about this new assessment is that the color coding representing levels of achievement (well below, below, benchmark, etc.) is the opposite of the other tests': red is high instead of low, and green is bad instead of good. You can't help but feel that it would make a good comedy routine... The only thing that is not funny is how harmful this is for children. Of course some assessment in education is necessary, but aren't we going too far with the mountains of color-coded data?

Danna Sermersheim Feb 10, 2024 05:24 PM

Thank you, Tim. Currently in Indiana’s state testing reporting system, each student earns an “achievement category” description of either below, at/near, or above in two standards: “Key Ideas and Textual Support/Vocabulary” and “Structural Elements and Organization/Connection of Ideas/Media Literacy”.

In 2025-2026 Indiana will change its state testing to include “Checkpoints” 3 times a year with a final summative at the end of the year. During the checkpoints students will be assessed on certain ELA standards the state has deemed as “essential” or “standard” levels of priority. For example, at the first checkpoint 4th graders will be assessed on standards 4.C.1, 4.C.2, 4.C.3, and 4.C.10. Then at checkpoint two, 4.C.1 will be assessed again along with 4 new standards. Reporting categories have been established as Reading and Understanding Fiction; Reading and Understanding Informational Text and Media; Understanding and Using Vocabulary; and Writing. The goal of these checkpoints is to provide year-round information for instructional response and to highlight mastery of the standards at the time of learning while providing 100% alignment to the standards. What are your thoughts on this approach and reporting system?

Timothy Shanahan Feb 10, 2024 06:07 PM

Danna- They are going to find the same thing that everyone else has. Comprehension questions are so highly correlated with each other that any claims made for one will be made for the others (for the most part -- there is always a bit of error in testing, too).

tim

Jennifer Borgioli Binis Feb 10, 2024 09:24 PM

To be sure, it's not the same in every state, but several states do actually provide reports down to the item level and how well students do on individual items that are aligned to specific standards. It's possible to generate reports that provide p-values by the item, by the subskill, and by the standard - at least in NY. In other words, it is possible for a district leader in the state to get a sense of how students are doing on specific state standards. Some examples are here: https://dwdataview.wnyric.org/report-guides/

Timothy Shanahan Feb 10, 2024 09:46 PM

Jennifer--

Thanks for that. The information you describe is just error and noise. Totally meaningless.

tim

Jennifer Borgioli Binis Feb 10, 2024 09:56 PM

I appreciate your sentiment and frustration. And that, to you, it's error and noise. However, there are lots of classroom and school-level educators across the state who invest a lot of time in thoughtful analysis of those reports from the state assessments and the implications for curriculum and instruction. While there are, for sure, valid arguments against doing that work, most of the people doing that work are in districts that have been told that if their scores fall, they risk closure. Or, if their scores aren't high enough, they need to change their entire reading program. It for sure seems that if SoR advocacy groups can prepare report cards comparing schools' performance on the tests, schools can at least consider such reports in their curriculum conversations. I'm not sure what's gained by suggesting they should just ignore the information as it's "meaningless."

Dr. Bill Conrad Feb 10, 2024 10:41 PM

You can never use one assessment item to make a decision about what a student knows or doesn’t know. The sample size is far too low and the standard error of measurement is far too high! Using single items to make instructional decisions is malpractice. Don’t do it.

Sara Peden Feb 11, 2024 12:29 AM

I love almost every word of this. I often think that most educators don't have a great understanding of test reliability and validity. They have *no* consistent teaching about these concepts, in my opinion. As a psychologist, I know that many *psychologists* who *do* have some training don't really grasp the implications of what they've learned about reliability and validity, so why would teachers have a better understanding?

At the same time, I do have a problem with the recommendation you make after pointing out the futility of using standardized tests to report on individual comprehension standards. Do you really think we have a reliable way to determine what constitutes 75% comprehension? Are teacher-made tests (which I think you're suggesting) reliable and valid measures of reading comprehension, as long as children can score 75% and the text readability/difficulty is held constant?

Whether a student could score 75% on any measure of comprehension would depend not only on the text difficulty (which we could reasonably hold somewhat constant) but on the difficulty of the questions, which reflects the teacher's judgment of what belonged on their assessment, right?

I could create a reading comprehension test on which almost every child who was able to decode the text independently could score near 100%. Alternatively, I could probably create a comprehension test that almost no child could score 75% on (unless they had a very specific repertoire of background knowledge). Obviously, the extremes aren't where any teachers would 'land' when creating questions to measure reading comprehension; but there's no reason, in my opinion, to believe that there would be much consistency among teachers in the ways they tried to measure comprehension of texts they selected based on readability/difficulty (Lexiles), as you mention.

How could we reach a consensus about what constitutes reasonable test questions (which I think you're suggesting teachers should write) for any particular piece of text? What teacher-made test is reliable to start with (or can be demonstrated to be so)? How do we know, without seeing the actual test, whether a score of 75% would meet a reading comprehension standard? Wouldn't it be better to try to qualitatively describe what constitutes a reasonable level of comprehension at a particular grade, by using some exemplars (perhaps with commonly taught texts from that grade)?

Saying "75%" seems to introduce the exact problems of any other quantitative measure. You're asking teachers to quantify a child's reading comprehension with a test. If you're okay with quantifying 'reading comprehension' overall, why not just accept the overall measures from well-constructed (standardized) and psychometrically sound tests of reading comprehension?

JD Durey Feb 11, 2024 12:57 AM

Very thought-provoking. Am I understanding correctly that teachers should not be reporting on the CCSS RI and RL standards except for text complexity?

Timothy Shanahan Feb 11, 2024 01:04 AM

JD—
Overall assessments of comprehension are fine, but research has found no meaning in scores based on students’ ability to answer particular types of questions.

Tim

Kim Feb 11, 2024 03:34 AM

Thank you for this post. Are you against standards-based grading, or just against analyzing standards on outcome assessments? What would your report cards look like?

Heather Feb 11, 2024 02:33 PM

I am in the same boat as Jose and Kim. I hear that pulling apart skills is not effective. From your research, what is the best way to score students in reading for report card purposes, specifically at the high school level?

If we gave tests and looked at scores of 75% and above, then there is the whole issue of students turning in the work teachers assign. Why turn in work like your theme-analysis paper when you could pass on a test?

Ugh, it’s so messy. I just want to help guide students toward the most effective ways, and I appreciate your thoughts.

Timothy Shanahan Feb 11, 2024 05:35 PM

Heather--

Have students read grade level texts and engage them in activities like (a) writing summaries, analyses, critiques, and syntheses of such texts; (b) answering questions about what that text said, how that text worked, and what it means in terms of how it connects with other texts or ideas.

How do they do with such tasks? Are they improving?

tim

Timothy Shanahan Feb 11, 2024 05:39 PM

Kim-
If it is impossible to use assessment to determine how well students are accomplishing a particular standard, then it is unlikely that a teacher could evaluate this. The correlation among items is so high that you cannot say that a student is doing well with standard 1 but not with standard 2... that is just too fine a point. You can determine that a student is having trouble comprehending 9th grade text, you may be able to see some large differences between how well they read narrative and informational text (though quite often there isn't much difference there)... You definitely can evaluate the extent to which students are writing a quality argument or essay or summary, the degree to which students are mastering the vocabulary that you are teaching, and things like that.

tim

Timothy Shanahan Feb 11, 2024 05:42 PM

JD--

Yes... that doesn't mean that you cannot ask questions based on the standards to determine how well students read a particular text, only that performances on those individual question types lack both the reliability and validity needed to say anything about how well students can answer questions of those types generally. I'll go along with the ACT findings based on data from ~500,000 high school juniors: knowing how well kids answered particular questions told them nothing about those students' comprehension, but knowing which texts they did well with told them a lot!

tim

lwj Feb 11, 2024 06:36 PM

Dr. Evangeline Aguirre, 30 year teacher in PA, exactly my experience as well, fwiw. Thanks

Timothy Daugherty Feb 12, 2024 01:12 AM

Tim, our district has been working on becoming a High Reliability School following Robert Marzano's Art and Science of Teaching. We have been grading standards-based for three years. I agree that simply assessing a standard does not improve reading scores, but would you agree that assessing standards helps guide teachers in how effective their instruction is? Also, doesn't standards-based teaching help teachers focus on important skills/strategies? I am conflicted because I see your point: if a student is not reading proficiently for their grade, then a test with certain standards-focused questions is going to tell the same thing. I feel standards-based grading is only effective if the school or teacher uses the assessment information to develop their instruction.

Katelyn Feb 12, 2024 02:27 PM

Hi Tim,

Thank you for another thought-provoking post. I am the Literacy Lead for a PreK-12 public school district that has begun the work of implementing Standards Based Learning over the last two years. Starting next school year, all elementary teachers will be engaged in intensive training around the Science of Reading. I have taken LETRS training and have a pretty solid understanding that reading comprehension is much more complex than simply a skill or a list of skills, since those skills don't necessarily transfer from one text to another.

One hurdle I am approaching as a Literacy Lead and struggling with is the idea of marrying the two concepts of Standards Based Learning and the Science of Reading, especially in regard to unconstrained skills such as reading comprehension. I can see how constrained skills such as phonics skills acquisition could be assessed and represented on a scale, but I struggle with the idea of creating separate scales for each reading comprehension standard. I have learned so much about how comprehension depends so much more on background knowledge and vocabulary.

I also struggle with the reliability of assessing comprehension using independent reading especially in the primary grades where they are not yet fluent readers. I have been learning a lot about how listening comprehension is a great way to get young readers exposed to the complex sentence structure and rich vocabulary that complex text offers (texts that they cannot decode yet for themselves). How could comprehension be assessed using listening comprehension?

Thank you for any guidance you can give me about these challenges!

Timothy Shanahan Feb 12, 2024 03:22 PM

Timothy--
Nope. That isn't the problem. For years, teachers and principals have assumed that comprehension skills were the abilities that allowed one to answer certain kinds of questions. We ask students to read a text or several texts, then we ask questions that get at main idea, inferencing, key details, drawing conclusions, and the like, and then, based on which questions they got right or wrong, we determine which skills the students need work with. The standards are really descriptions of those question types.
That doesn't work, because tons of research has shown that what affects those comprehension scores has pretty much nothing to do with what kinds of questions were used to elicit the answers. It is the texts that determine how well the kids read. The text is made up of vocabulary, syntax, cohesive links, structural organization, graphic devices, and literary devices, as well as content and its match to the readers' background knowledge, depth of content, and so on. You're going to push teachers to teach main idea, but the students will get that question wrong because the main idea was stated explicitly in a complicated sentence with two vocabulary words they didn't know. You're going to waste everyone's time teaching kids how to make general inferences when the student is having trouble connecting the ideas across the text.
The question items on a standardized reading test are so connected to each other (by the text) that it is impossible to measure multiple skills or abilities based on such a test.

tim

Drew Feb 12, 2024 06:12 PM

So if one were in a standards-based district that attempted to break down reading standards into fairly specific objectives, it sounds like you advocate revising those report cards to one broad category of Reading Comprehension? And any evidence teachers collected would be reported beneath that broad category, whether it be finding the main idea, making complex inferences, or analyzing the effect of literary elements, to name a few.

Or even possibly report Reading Comprehension differently using a Lexile score that would indicate the level of texts students should have success with?


Timothy Shanahan Feb 12, 2024 08:05 PM

Katelyn--
I wouldn't recommend using listening comprehension assessment in place of reading comprehension assessment. Here are a couple of relevant blog entries.

https://www.shanahanonliteracy.com/blog/does-a-listening-deficit-predict-a-reading-deficit

https://www.shanahanonliteracy.com/blog/what-does-listening-capacity-tell-us-about-reading

tim

Saily Feb 15, 2024 12:45 PM

Thanks for this! I have often been near tears trying to assign grades of 1-4 on 30+ standards per student while not having any real data on some of them. Now I know why I felt so ridiculous.

Just to clarify (since, as foolish as this errand may be, my paycheck won't arrive until my students have a standards-based report card submitted): Your recommendation would be to assess students at a variety of text levels (presumably for grades 2+, not K-1), ask a variety of comprehension questions, average the scores, and assign that average for all comprehension-related standards (perhaps separating out informational and literary texts). Did I get that right? Or would the text levels correspond to grades 1-4, and at the highest text level where the student is comprehending "75%" of the text, we would assign that score to all reading comprehension standards?

I know your aim is probably to reduce the silly grading schemes in the first place, but ultimately, I am trying to understand whether grading should be based on the highest text level at which a student comprehends, or their general ability to comprehend a range of text levels. I hope my question is clear.

Timothy Shanahan Feb 17, 2024 06:35 PM

Saily-

Indeed, you got that right. The problem isn't the questions; it is the attempt to interpret response patterns to particular types of questions. What is useful is knowing that students can read second-grade texts and answer most of the questions but struggle to do this with third-grade texts (these can be stated in Lexiles or book levels). It isn't useful to know that the student got a main idea question right and two key ideas questions wrong.

tim

Katy Feb 22, 2024 04:33 PM

How does this translate to the objectives/"I can" statements? When posting the standards in kid-friendly language, I feel like it takes the focus away from overall comprehension. How can I meet the requirement in a way that matches the desired outcome?

Timothy Shanahan Feb 22, 2024 08:31 PM

Katy--
I would not write such statements for the individual reading standards. Perhaps you can get some by focusing at the level of Key Ideas and Details, Craft and Structure, Integration of Knowledge and Ideas, and text structure: "I can read a text and understand what the author said or implied. I can read a text and describe its structure," etc.

tim

Ann Feb 24, 2024 09:12 PM

This is a fascinating discussion. So much has always depended on the skill of the teacher to infer competence and to pull the student forward. Most school districts inundate teachers with data AND demand that teachers adhere to a purchased SOR-based curriculum. Changing the trajectory of learning for students who struggle is demanding and difficult. In classrooms where 80% of the children are displaying half a year’s growth for every year of schooling, how can teachers change outcomes using data?

Timothy Shanahan Feb 26, 2024 04:22 AM

Ann--
I know the idea of trying to target learning on individual needs and to use data, etc. is a big deal right now, and yet the research behind it is pretty limited. I'm not against trying to use data, but I would work in broad strokes first. Make sure the curriculum overall is strong and appropriate and that the materials being used support it, make sure the teachers are able to teach that curriculum reasonably well and in a sufficient amount, and then start to refocus instruction for kids to make sure the kids on the extremes are getting what they need too. Going the other way won't work.

tim
