“Only in education,” part three.

Once again I’ve been drafted to participate in what I call the “Not A Thing” seminar. It involves missing five days of school (today being the second) in the hopes of learning how to make the rest of the faculty (i.e., the part actually getting stuff accomplished because they’re at school instead of in the seminar) more professional, more collaborative, and more focused on student learning. I’ve already written two earlier posts on N.A.T., one for each of my previous trips through it.

As expected, I got almost nothing new out of it. I did get some planning done, had an opportunity to speak with the principal, saw a couple of my old bosses, and enjoyed a nice roast beef sandwich with Swiss on rye.

Anyhow, on to the griping. Early in the day, the presenter said (I’m paraphrasing slightly), “Common assessments should never be used for teacher evaluation.” What, you may ask, is a “common assessment”? It’s an assessment (test, quiz, essay, project, etc.) given to all students in a given subject at a given school, or in a district, or statewide, or nationally.

Say Mr. Alpha, Mr. Beta and Mr. Gamma gave exactly the same test to their students on the same day. That would be a common assessment. Say Alpha’s students averaged 90% on the test, Beta’s students averaged 70% on the test, and Gamma’s students averaged 50% on the test. It seems safe to say that Alpha’s students were better prepared for the test than Beta’s students, who in turn were better prepared than Gamma’s. Why the discrepancy?

There could be several reasons for the discrepancy. Maybe Alpha has the strongest students and Gamma has the weakest. Maybe Alpha, Beta and Gamma didn’t have the same amount of time or the same type of materials to prepare their students for the test. Maybe there were distractions in Gamma’s room that didn’t occur in Alpha’s room. Maybe Beta’s students were all having bad days, and Gamma’s kids were having worse days. Or, perhaps, Alpha taught the content and skills better than Beta, who in turn taught it better than Gamma. Perhaps Alpha is just a better teacher than the other two. Why on Earth would we preclude that possibility, or reject the idea of using common assessments to test that possibility?
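
For the statistically inclined, here is a minimal sketch of how you might start testing that possibility with a one-way ANOVA (analysis of variance). It asks whether the gaps between the three class averages are large compared with the spread of scores inside each class. The scores are invented for illustration (five students per class, averaging 90, 70, and 50), and the function name `one_way_anova_f` is my own. A large F value only says the classes differ by more than chance would explain; it does not say whether the cause is the teacher, the students, or the room, so every other explanation above still has to be ruled out separately.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of score lists."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    group_means = [mean(group) for group in groups]

    k = len(groups)       # number of classes
    n = len(all_scores)   # total number of students

    # Variation of the class averages around the overall average.
    ss_between = sum(
        len(group) * (group_mean - grand_mean) ** 2
        for group, group_mean in zip(groups, group_means)
    )
    # Variation of individual scores around their own class average.
    ss_within = sum(
        (score - group_mean) ** 2
        for group, group_mean in zip(groups, group_means)
        for score in group
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical test scores, invented for this example only.
alpha = [95, 90, 85, 92, 88]
beta = [75, 70, 65, 72, 68]
gamma = [55, 50, 45, 52, 48]

f_stat = one_way_anova_f([alpha, beta, gamma])
print(f"F statistic: {f_stat:.2f}")
```

With real classes you would compare the F statistic against the F distribution to get a p-value, and you would want far more than five students per teacher before reading anything into it.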

Take the AP United States History (APUSH) exam, a common assessment created by the College Board. My students have had the highest pass rates at my school on that exam every year I’ve taught the class. I’d like to think that I’m one of the better APUSH teachers at my school, but my high pass rates are most likely a reflection of the fact that I have the strongest students at the school. But let’s assume the opposite: say my pass rate on the APUSH exam is 50% lower than the school average even though my students are 50% smarter than the average kid at my school. Why shouldn’t my boss use that data to reconsider my position at the school, or at least to keep a closer eye on me?

Judging and evaluating teachers is not easy or precise; after all, it’s not as if every teacher has a perfectly randomized sample of the student body in their classes so that teacher comparisons are statistically unbiased. And common assessments shouldn’t be the only, and perhaps not even the primary, tool for evaluating teachers. But if you put common assessments completely off-limits, as this presenter suggested, then you eliminate the most objective element of teacher evaluation. What are you left with?

There was also the usual nonsense about never giving zeroes, but I’m too tired to write about it right now. Suffice it to say that a grade of zero remains a distinct possibility in my classroom.

I’m watching V. It’s okay, but I miss Inara’s raven curls. The pixie look just doesn’t cut it.

2 thoughts on ““Only in education,” part three.”

  1. I think you should blow the presenter’s mind by suggesting that a properly constructed ANOVA (Analysis of Variance, for the statistically uninitiated) test would tease out which effects were significant and which were not. And yes, teacher competence can be inferred if the statistical measures are properly constructed and controlled.


  2. I agree, common assessments shouldn’t be the primary tool, but they should be one of the top three tools used for evaluation.

    A student’s result on a respected, legitimate common assessment like the AP exams should approximately match the grade they received in class. If a teacher gives out a lot of A’s, then those students should be receiving 4s or 5s on the corresponding AP exam. If they don’t, then there is something seriously wrong with the teacher and not the students. A teacher, especially at Paxon, should not have more than 10 students (an arbitrary number) receiving an A in the class who then receive a 1 on the exam. Just the thought of it is completely ludicrous; there is no way that everyone could be having a bad day. But you must account for the outliers and just look at it as a general trend. So if a teacher has a lot of “A” students and a lot of 1s on the exams, it’s bad, whereas if a teacher has very few “A” students but those students have earned 5s on the exams, it’s good.

    Ideally, I think the exam grade and the grade in class should match up perfectly, but then a majority of students would fail both the exam and the class, parents would complain, grade inflation would get worse, and so on.

    Here are a few good topics for the seminar:
    stop curving everything,
    reverse the grade inflation trend that has been building since the Vietnam War.

