My Students’ Test Scores Went Up!
And my reasonable reaction to this news
Some years ago I sat in on a lecture by the great education reform advocate, Alfie Kohn.¹ During the lecture, Mr. Kohn stated that there are two reasonable responses to the news that test scores have increased in one’s district. The first² he referred to as the moderate response: “So what?” According to Kohn, and the prevailing research, there is no empirical evidence that test scores are a reliable measure of “learning.” In fact, he pointed out research that revealed the opposite influence; those who scored the highest on standardized tests tended to be the shallowest thinkers and least actively involved students. He also pointed out that standardized tests have virtually no predictive value for future academic success. In real terms, there is no reason to take pride in higher test scores.
Yet there is considerable social pressure on teachers to do just that. This last year, when improved test scores for my school were announced, there was resounding applause among the faculty. We were recognized for “working hard to raise our scores.” It was difficult not to get caught up in the exhilaration. Indeed, we did work hard. Much of our effort was focused on raising those test scores. Teaching methodologies, such as the Kagan method, were implemented, sold to us in part because they are proven to raise test scores. During two Kagan trainings, the trainer asked us about our goals as teachers. One question was, “Do you want your students’ test scores to go up?” There was a resounding “yes” and even applause from the faculty. I was somewhat dismayed by how effectively teachers had been socialized to see rising test scores as a reasonable goal.³
To be honest, when I look at my students’ test scores, I must admit to a certain visceral excitement when I see their “progress.” Then reason takes hold, as does my crippling sociological imagination. Are the increases in my students’ test scores something to be proud of, or should my response be “so what?”
Yes, my students, taken as a whole, showed higher overall scores on this year’s test as compared to last year’s test. I do not know if the difference between the two is statistically significant, as I’ve not taken the time to run the numbers. (Perhaps that will be a later post.) I can say that, as I scanned the data, I observed that the vast majority of students showed little significant improvement despite an overall positive change in their raw scores. According to one source, an increase of 78 points is the equivalent of a year’s learning. Based on this assumption, every student should score around 78 points higher than on the previous test. In fact, many of my students did just that. However, was that the result of my awesomeness as a teacher? I’d like to say it was, but the facts might be a little more slippery.
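Running the numbers would amount to a paired comparison: take each student’s gain from last year to this year, then ask whether the average gain differs from the 78 points that supposedly mark a year’s learning. What follows is only a rough sketch in Python. The scores are invented for illustration and are not real FCAT data, the function name `paired_gain_summary` is hypothetical, and a t statistic is just one of several reasonable ways to frame the question.

```python
import math
import statistics


def paired_gain_summary(last_year, this_year, expected_gain=78):
    """Summarize year-over-year gains for the same students.

    Returns the mean gain, the sample standard deviation of the gains,
    and a one-sample t statistic for whether the mean gain differs
    from expected_gain.
    """
    if len(last_year) != len(this_year) or len(last_year) < 2:
        raise ValueError("need two equal-length lists with at least two students")
    # One gain per student: this year's score minus last year's.
    gains = [after - before for before, after in zip(last_year, this_year)]
    mean_gain = statistics.mean(gains)
    sd_gain = statistics.stdev(gains)
    if sd_gain == 0:
        raise ValueError("all gains are identical; the t statistic is undefined")
    std_error = sd_gain / math.sqrt(len(gains))
    t_stat = (mean_gain - expected_gain) / std_error
    return mean_gain, sd_gain, t_stat


# Invented scores for five hypothetical students (assumption, not real data).
last_year = [1500, 1620, 1710, 1480, 1800]
this_year = [1580, 1690, 1795, 1550, 1880]

mean_gain, sd_gain, t_stat = paired_gain_summary(last_year, this_year)
print(f"mean gain = {mean_gain:.1f}, sd = {sd_gain:.1f}, t = {t_stat:.2f}")
```

A t statistic near zero would mean the average gain is indistinguishable from the expected 78 points. Even a large one would say nothing about why the scores moved, which is the harder question.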
First, what caught my eye were the students who gained well above 78 points. Was I such a good teacher that I over-taught the year? And if that’s true, why didn’t all of my students gain more than 78 points? Or could there have been other variables, independent of my quality as a teacher?
Let’s look at one outlying student; I’ll call her Sally. Sally scored over 700 points higher on the reading portion of her FCAT this year than she did last year. Seven hundred points! Yay me! Hey, since I, as a teacher, must take the blame for all the failures of public schools, I should be able to take credit for one profound success. Shouldn’t I? Well, as much as I would like to, I’m afraid there are other factors that I would have to control for before I can understand just how much value I added to this child’s amazing test results.
- Perhaps the stars aligned. In other words, the student’s results might be due to nothing more significant than luck. Of course, as a sociologist, I’m not inclined to think in terms of luck; but I can think in terms of probabilities. There is a probability that she just happened to guess well on the stuff she did not know. I tend to discount this hypothesis because it’s demoralizing, but also because there was more than one student whose increase was extraordinary, though hers was the largest.
- If this year’s test might be an outlier due to nothing more than probability, it’s also possible that last year’s test was a probabilistic low outlier. In other words, if she wasn’t lucky this year, she may have been unlucky last year. Evaluations of teachers and schools, however, are based on a comparison of the current year to the prior year, not a complicated evaluation of student trends.
- It could be that she learned seven years’ worth of material this year; thus, her test results for this year and last year are accurate representations of her learning. That would be wonderful. It’s almost certainly not true. Florida has a pretty robust curriculum set, especially at the high school level. There’s very little time for “re-teaching” much from the year before, let alone from the previous six years. So, if Sally did learn six extra years’ curriculum this year, she must have done so on her own. More power to her. Can we say that this is the result of quality teaching? Perhaps inspiring teaching? It’s impossible to say.
- An option that I think is plausible is that she was able to learn a foundational skill, or a skill set, that helped her understand the test or the test questions better as a whole. How did this happen? It could be that one of her teachers taught these foundational skills in a manner consistent with her learning modalities. But which teacher was it? It probably wasn’t me because I’m not a reading or language teacher. On the other hand, it could have been me since reading is a big part of my class. Or it could be that one teacher, or a combination of teachers, taught the skill, and I reinforced it. So how much credit should I get? How much “value” did I add? On the other hand, she might have learned the requisite skills from a previous teacher, but didn’t actually “get” or understand the skills until this year when it finally “clicked.” In this case, the right teacher will not get credit for Sally’s higher test scores.
- Related to the previous option is the possibility that Sally acquired or mastered the foundational skill due to neurological development that occurred during this year. She may have been exposed to this hypothetical skill every year for the last five years, but did not possess the neurological “wiring” to assimilate it into her learning until this year. This is an often-ignored aspect of learning. In this case, there is no way to assign credit for Sally’s own neurological development.
- It could be that this year was the first in which she cared about the FCAT. In Florida, the tenth grade FCAT determines graduation. Understanding the importance of this test over previous exams may have compelled her to try harder.
- Conditions of overall health and well-being may have influenced her test performance. It could be that this was the first year in which Sally received adequate nutrition, sleep, or exercise. Perhaps her home life was finally stable. Maybe her social status in the school was such that she developed a positive self-image. Again, these are not variables within the control of the teacher.
- I hate to say it, but she could have been under the influence of performance enhancing drugs, legal or illegal, that helped her concentrate or increased her thinking capacity. Many stimulant drugs from Ritalin to cocaine have this effect.
Any of the above variables, and perhaps more, could have been deciding factors in her impressive testing gains. Alas, I can’t reasonably take credit for any of them definitively. Yet I’m expected to take great pride in her results and the cumulative results of all of my students for working hard to increase those test scores. Unfortunately, there’s almost no substantive reason for this pride.
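The first two possibilities on the list, a lucky or unlucky test day, can be made concrete with a small simulation. The sketch below assumes every student truly gains exactly 78 points and that each test adds random measurement noise to the true score. The noise level (`noise_sd=150`) is a number invented for illustration, not a published property of the FCAT, and the function name is hypothetical.

```python
import random


def simulate_score_changes(n_students=1000, true_gain=78, noise_sd=150, seed=0):
    """Simulate observed year-over-year changes when every student
    truly gains the same amount.

    Observed change = true_gain + (noise this year - noise last year).
    noise_sd is an assumed value chosen for illustration only.
    """
    rng = random.Random(seed)
    changes = []
    for _ in range(n_students):
        noise_last = rng.gauss(0, noise_sd)  # last year's luck, good or bad
        noise_this = rng.gauss(0, noise_sd)  # this year's luck, good or bad
        changes.append(true_gain + noise_this - noise_last)
    return changes


changes = simulate_score_changes()
true_gain = 78
# Count students whose observed change is far from what they truly gained.
big_gains = sum(1 for c in changes if c > true_gain + 300)
big_drops = sum(1 for c in changes if c < true_gain - 300)
print(f"{big_gains} students look like stars, {big_drops} look like failures")
```

Under these assumed numbers, some students show swings hundreds of points away from their true gain even though every one of them learned exactly the same amount. Whether chance could account for a change as large as Sally’s depends entirely on how noisy the real test is, which this sketch does not know.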
On the other end of the spectrum, were there students who scored inordinately lower this year than last? The answer is yes, but fortunately there were fewer of them than there were Sallies. Let’s focus on one such student, whom we’ll call Phil. Phil scored over five hundred points lower on this year’s reading FCAT than last year’s. Why? Was I such a bad teacher that I, somehow, untaught Phil six years’ worth of learning? I find this hard to believe. What happened?
Well, all of the variables attributed to Sally could be true in reverse for Phil. Perhaps he had a bout of bad luck. It could be that this year’s test score was an outlier on the negative end of the continuum, or that last year’s was an outlier year on the positive end and this year’s scores were consistent with his overall trend. Maybe Phil just didn’t care about this test enough to try hard. After all, the tenth grade FCAT can be retaken and passed at any time between 10th and 12th grades. It could be that Phil knows that he’s not going to graduate anyway, for whatever reason, or that the FCAT is irrelevant to Phil’s educational or personal goals. Perhaps he knows that he’s moving out of state where the FCAT does not matter.
During this time of economic instability we cannot rule out the possibility that Phil’s test scores reflect a sudden destabilizing of his life. His parent or parents may be out of work, facing foreclosure, or arguing over money. Maybe they cannot afford to provide proper nutrition or other health-sustaining resources to Phil. It could very well be that Phil had to find a job to help provide for his family. Even the knowledge of reduced economic prospects may have squashed his personal investment in the future, or in school. My school emphasizes college, college, college. If Phil knows that he is not likely to go to college, why should he participate?
Instability does not have to be economic. His family may be unraveling for other reasons. Phil may be facing the prospect of parental conflict or divorce. New members may have entered his household, such as a parent’s boyfriend or girlfriend. Might these relationships be unhealthy, destabilizing or even abusive?
Phil may have also fallen under the influence of drugs or other unhealthy influences that usurp the focus and goals of young people. This is not unusual for fifteen or sixteen year olds. Even typical teenage relationships with peer groups or romantic interests can interfere with one’s academic prospects. Has Phil changed schools, been uprooted from a comfortable environment?
Then there’s the matter of neurological development. Now, neurology does not typically un-develop in youth, but that does not rule out the prospect of neurological damage caused by falls, toxification or drug use. In Phil’s defense, I do not believe he suffered from brain damage and it is my hope that he has not fallen into a drug habit or addiction (again, I don’t believe he has).
Regardless, there are so many possibilities that might have negatively influenced Phil that are not a reflection of my skills, or the skills of my colleagues. Yet some would penalize teachers for Phil’s poor performance.
One student who was of particular interest to me was a fellow whom we can call Tony. Throughout Tony’s career he has scored high on the FCAT. The FCAT is scored between 1 and 5. Tony consistently scored in the high 4’s, last year scoring a 5 on the reading portion. I paid special attention to Tony this year because, no matter what I did, his classwork and assessments never matched his FCAT performance. Tony was bright. If I asked him a question he invariably had an adequate answer. However, he rarely did any of his assignments, and when he did turn something in it was sloppy, poorly written, not done according to directions and often incomplete. I spoke to his mother. I spoke to his other teachers. We had parent-teacher conferences. None of us could understand how he could do so well on the FCAT, yet do so poorly in his classes.
This year, Tony resolved the contradiction. He scored a 2 on the FCAT Reading test. How much of this score was strategic on Tony’s part? It’s impossible to say. Which was the true measure of Tony’s abilities? I would hazard, based on observation, that his historic FCAT scores were the most accurate assessment of Tony’s potential. I offer that his class performance was rather more indicative of his work ethic. Of course, there may be some underlying problems that I have not identified.
Regardless, student performance is subject to so many variables, internal or external, and so many ecological pressures that it is impossible to infer a teacher’s skill and influence from a single test score. Yes, there are some statistical procedures that can be performed to parse out a correlation that might indicate positive or negative influence, but they are unlikely to provide significant results when so many variables must be controlled.
Kohn is correct. “So what?” is the best response to the news that your students’ test scores have gone up. Teachers should not be working hard “to raise test scores.” Any professional teacher worthy of that title should be working hard to provide the best, most fulfilling, most inspiring education for their students. They should be working hard to prepare their students for a lifetime of learning and growth outside of the classroom and beyond the school building. Such is the laudable life mission of teachers. A mundane bureaucratic endeavor that reduces “education” to a number between 1 and 5 is not worthy of a teacher’s attentions, and certainly nothing in which a teacher should take pride.
I have never been proud of a test score. I’ve been relieved by test scores, and disappointed by test scores, but never proud. What makes me proud, and what is a true assessment of the value added in the life of a student? It’s when former students reach out to me and say, “Thank you, Mr. Andoscia.” I have many former students now, building their lives and contributing to society. I don’t remember any of their test scores. Nor should I.
¹ The lecture was to promote his book, The Schools Our Children Deserve: Moving Beyond Traditional Classrooms and “Tougher Standards.”
² The second reasonable response, according to Kohn, was the more radical, “My dear God, no! What have you denied my child in order to get those higher test scores?” I agree with this more radical statement, but it is beyond the parameters of this essay. I could offer that when teachers are focusing on test scores, there is the possibility that they are not focusing on the holistic value of a child’s education. When teachers take pride in their test scores, they may be magnifying the false value of a purely bureaucratic endeavor at the expense of a truer value in the humanity of their students. To “deny” a child in such a manner is not consistent with a teacher’s ethics.
³ I am very fortunate to work for a principal who truly believes that providing the best possible education is the best strategy for raising test scores. He has mastered a fine-tuned balance between the virtues of quality teaching and the practical reality that our school is judged based on test scores. In my school, teachers are treated as professionals and given every opportunity to better themselves. Such is the cross borne by quality principals and administrators everywhere. The Kagan method is a valuable tool for teachers and can be presented as such. That the presenter felt the need to justify the Kagan method by using test scores is a critique of our current state of education, not the Kagan method itself.