Should we grade students on the individual reading standards?

Blast from the Past: This entry first posted on September 7, 2019, and reappeared on February 10, 2024. It seems to be that time of the year again. Principals are being encouraged by central administrations to put on the full court press for higher test scores this year. I know that, not because of Mardi Gras or Groundhog Day, but because I'm starting to get those questions from teachers. Here is one that just came in: "Dr. Shanahan, you have stated that STANDARDIZED reading test items analysis is a 'fool's errand.' My district requires me to complete an items analysis of a standardized reading test to 'narrow my instructional focus on specific skills and question types' - per my admin. You stated, 'It is not the question type. It is the text.' How can I convince my supervisors to move away from this practice? Please help." Almost 5 years separate these questions, and the answer to them has not changed a bit.

Teacher question:

What are your thoughts on standards-based grading in ELA which is used in many districts? For example, teachers may be required to assign a number 1-4 (4 being mastery) that indicates a student’s proficiency level on each ELA standard. Teachers need to provide evidence to document how they determined the level of mastery. Oftentimes tests are created with items that address particular standards. If students get those items correct, that is evidence of mastery. What do you recommend?

Shanahan response:

Oh boy… this answer is going to make me popular with your district administration!

The honest answer is that this kind of standards-based grading makes no sense at all.

It is simply impossible to reliably or meaningfully measure performance on the individual reading standards. Consequently, I would not encourage teachers to try to do that.

If you doubt me on this, contact your state department of education and ask them why the state reading test doesn’t provide such information.

Or better yet, see if you can get those administrators who are requiring this kind of testing and grading to make the call.

You (or they) will find out that there is a good reason for that omission, and it isn’t that the state education officers never thought of it themselves.

Better yet, check with the agencies that designed the tests for your state. Call AIR, Educational Testing Service, or ACT, or the folks who designed PARCC and SBAC or any of the other alphabet soup of accountability monitoring devices.

What you’ll find out is that no one has been able to come up with a valid or reliable way of providing scores for individual reading comprehension “skills” or standards.

Those companies hired the best psychometricians in the world, and have collectively spent billions of dollars designing tests, and haven’t been able to do what your administration wants. And, if those guys can’t, why would you assume that Mrs. Smith in second grade can do it in her spare time?

Studies have repeatedly shown that standardized reading comprehension tests measure a single factor—not a list of skills represented by the various types of question asked.

What should you do instead?

Test kids’ ability to comprehend a text of a target readability level. For instance, in third grade you might test kids with passages at appropriate levels for each report card marking (475L, 600L, 725L, and 850L). What you want to know is whether kids could make sense of such texts through silent reading.

You can still ask questions about these passages based on the “skills” that seem to be represented in your standards—you just can’t score them that way.

The benchmark is whether kids can make sense of such texts with 75% comprehension.

In other words, it’s the passages and text levels that should be your focus, not the question types or individual standards.
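For anyone who keeps these records in a spreadsheet or script, here is a minimal sketch of that scoring logic in Python. The Lexile targets and the 75% criterion come from the example above; the function names, the data layout, and the reading of "75% comprehension" as the share of questions answered correctly are my own assumptions for illustration. The point it shows is that every question on a passage feeds one overall score, and nothing is broken out by question type or standard.

```python
# Hypothetical sketch: names and data layout are illustrative assumptions.

# Third-grade target text levels for each report card marking (from the example above).
TARGET_LEXILES = {1: 475, 2: 600, 3: 725, 4: 850}

# The 75% comprehension criterion, treated here as the share of correct answers.
COMPREHENSION_THRESHOLD = 0.75


def passage_score(answers):
    """Return the fraction of questions answered correctly on one passage.

    `answers` is a list of booleans, one per question. Question types are
    deliberately ignored: the result is a single overall figure.
    """
    if not answers:
        raise ValueError("a passage needs at least one question")
    return sum(answers) / len(answers)


def reads_at_target(marking_period, answers):
    """Report whether a student met the criterion on that marking period's text."""
    score = passage_score(answers)
    return {
        "target_lexile": TARGET_LEXILES[marking_period],
        "score": score,
        "met_target": score >= COMPREHENSION_THRESHOLD,
    }


# Example: second marking period, 6 of 8 questions answered correctly.
result = reads_at_target(2, [True] * 6 + [False] * 2)
print(result)
```

A student who falls short here needs more work with texts at that level, not drill on whichever questions were missed.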

If kids can read such passages successfully, they’ll be able to answer your questions. And, if they can’t, then you need to focus on increasing their ability to read such texts. That means teaching things like vocabulary, text structure, and cohesion and having the kids read sufficiently challenging texts—not practicing answering certain types of questions.

Sorry administrators, you’re sending teachers on a fool’s errand. One that will not lead to higher reading achievement. It will just result in misleading information for parents and kids and a waste of effort for your teachers.

References

ACT. (2006). Reading between the lines. Iowa City, IA: American College Testing.

Davis, F. B. (1944). Fundamental factors in comprehension in reading. Psychometrika, 9(3), 185–197.

Kulesz, P. A., Francis, D. J., Barnes, M. A., & Fletcher, J. M. (2016). The influence of properties of the test and their interactions with reader characteristics on reading comprehension: An explanatory item response study. Journal of Educational Psychology, 108(8), 1078-1097. https://doi.org/10.1037/edu0000126

Muijselaar, M. M. L., Swart, N. M., Steenbeek-Planting, E., Droop, M., Verhoeven, L., & de Jong, P. F. (2017). The dimensions of reading comprehension in Dutch children: Is differentiation by text and question type necessary? Journal of Educational Psychology, 109(1), 70-83. https://doi.org/10.1037/edu0000120

Spearritt, D. (1972). Identification of subskills of reading comprehension by maximum likelihood factor analysis. Reading Research Quarterly, 8(1), 92–111. https://doi.org/10.2307/746983

Thorndike, R. (1973). Reading as reasoning. Reading Research Quarterly, 9(2), 135-147. https://doi.org/10.2307/747131

To examine the comments and discussion that responded to the original posting, click here:

https://www.shanahanonliteracy.com/blog/should-we-grade-students-on-the-individual-reading-standards

Comments

Dr. Bill Conrad Feb 10, 2024 02:58 PM

Great advice Tim! As usual!

Assessment illiteracy abounds in education!

If administrators were master teachers, we would no longer have the problem of administrators engaging in very bad advice designed primarily to provide inappropriate shortcuts to improve student performance on standardized tests. Administrators perceive that these inappropriate shortcuts will improve test scores and protect their inflated salaries! They won’t and they do a disservice to children and families.

There is way too much self over service and loyalty over competence in education.

Read The Fog of Education!

Dr. Evangeline Aguirre Feb 10, 2024 03:06 PM

Grading students based on performance on standardized assessments is not a guarantee of fair and accurate evaluation of their literacy/academic performance. As a former ESL/Reading teacher, I had firsthand experience of how this practice of malpractice continues to negatively affect the literacy skills of students, and sadly erode their enthusiasm for reading. Overall comprehension is what we want our students to be good at. Unfortunately, the “chaotic test prep season” puts stress on both students and teachers - trying to maximize review of test strategies. “What are the key words? Did you skim and scan the text?” among many other staple review questions.

I would like to grade my students’ abilities to read based on the depth of their comprehension as demonstrated in oral discussions, writing assessments, literacy reflections and project-based learning.

One needs to take a closer look at the culture standardized assessments built. It’s a competition on who gets more funding based on test scores. Who gets to be an “A” school?

How does the system of standardized assessments help ELLs? Even newcomers take the assessments - no knowledge of English at all. How many times have I witnessed these students sit through hours of assessments on material they knew full well they knew nothing about? And YES, I raised questions about this on many occasions. How about the culturally-bound concepts they’re not familiar with? I remember when my newcomers sat the PSAT for a good 6 hours of testing. They came back telling me they slept the whole time. We could have read some great articles, engaged in relevant projects, and made many other more productive uses of the 6 hours.

While a lot of educators go through the pressure of test prep and grading, I was privileged enough to work with administrators who allowed me to modify the grading procedure & develop assessments based on the unique and special needs of my students. The school got the grade for standardized assessment results.

This is a serious conversation that needs dire attention.

Jose Feb 10, 2024 03:38 PM

Tim, I’m a bit confused about your comment that states don’t provide the standards-level information because our state (MA) does tag each test question by standard. (Whether that’s a good idea or not, that’s another story.) To clarify, you’re saying they may provide it, but it’s not reliable? Or that they don’t provide it at all?

I know you’ve long argued against trying to disentangle the reading standards. But are you similarly arguing against standards-based grading, particularly at the secondary level? Maybe not necessarily one grade/rating of 1-4 (Not Meeting, Partially Meeting, Meeting, Exceeding) per standard, but clumping clusters of reading standards (Key Ideas and Details, Craft and Structure, Integration of Knowledge and Ideas) and writing standards, or even the reading and writing standards as a whole? I know that such approach would be flawed, but I don’t see how it is more flawed than the typical 100-point grading scale that tries to communicate all sorts of information into one number and therefore communicates nothing.

To put all this another way, if we shouldn’t use the standards as a basis for grading, what’s the “right” way to use them to inform our curriculum, assessment, and instruction?

To be clear, I’m not disagreeing with you or saying what many districts are doing isn’t misguided. I’m just trying to disentangle the issue of hyper-focusing on question types from the issue of standards-based vs traditional grading.

Timothy Shanahan Feb 10, 2024 03:48 PM

Jose--
That is stupid, but they don't provide scores on those... they don't tell you that a student or your school failed to succeed on standard #... What they are telling you is they ask questions based on the standards -- but they don't tell you how anybody does on any individual standard.

tim

Sara Peden Feb 11, 2024 12:29 AM

I love almost every word of this. I often think that most educators don't have a great understanding of test reliability and validity. They have *no* consistent teaching about these concepts in my opinion. As a psychologist, I know that many *psychologists* who *do* have some training don't really grasp the implications of what they've learned about reliability and validity, so why would teachers have a better understanding?

At the same time, I do have a problem with the recommendation you make after pointing out the futility of using standardized tests to report on individual comprehension standards. Do you really think we have a reliable way to determine what constitutes 75% comprehension? Are teacher made tests (which I think you're suggesting) reliable and valid measures of reading comprehension, as long as children can score 75% and the text readability/difficulty is held constant?

Whether a student could score 75% on any measure of comprehension would rely not only on the text difficulty (which we could reasonably hold somewhat constant) but on the difficulty of the questions making up the teacher's judgement of what belonged on their assessment, right?

I could create a reading comprehension test that almost every child who was able to decode the text independently could score near 100%. Alternatively, I could probably create a comprehension test that almost no child could score 75% on (unless they had a very specific repertoire of background knowledge). Obviously, the extremes aren't where any teachers would 'land' when creating questions to measure reading comprehension; but there's no reason, in my opinion, to believe that there would be much consistency among teachers in the ways they tried to measure comprehension of texts they selected based on readability/difficulty (lexiles) as you mention.

How could we reach a consensus about what constitutes reasonable test questions (which I think you're suggesting teachers should write), for any particular piece of text? What teacher made test is reliable to start with (or can be demonstrated to be so)? How do we know, without seeing the actual test, if a score of 75% would meet a reading comprehension standard? Wouldn't it be better to try to qualitatively describe what constitutes a reasonable level of comprehension at a particular grade, by using some exemplars (perhaps with commonly taught texts from that grade)?

Saying "75%" seems to introduce the exact problems of any other quantitative measure. You're asking teachers to quantify a child's reading comprehension with a test. If you're okay with quantifying 'reading comprehension' overall, why not just accept the overall measures from well constructed (standardized) and psychometrically sound tests of reading comprehension?