Gender bias bias

A widely publicised recent study allegedly found rampant gender bias in elementary school mathematics. The soundbite version of the study is unequivocal:

“Teachers consistently underrate girls’ math skills.” – New York University

“Teachers consistently underrated girls’ math skills, even when boys and girls behaved and performed in similar ways academically.” – PBS

“Perhaps most unsettling is the study’s finding that teachers perceive girls with nearly identical mathematical abilities—and identical behavioral profiles—to be significantly less able than their male counterparts.” – Quartz (retweeted by the MAA)

Judging by these quotes you might think that the study:

1. Measured the mathematical ability of children.
2. Measured their teachers’ perceptions of their mathematical ability.
3. Compared the above two results.
4. Found that teachers rate the ability of girls lower than their actual test performance.

But not so. The researchers did steps 1-3 alright, but they did not find 4. They found the opposite. What the data actually says is that teachers judge ability accurately, by and large. If anything, teachers tend to overestimate the ability of girls. For example, “at the very top of the distribution, teachers rate the math proficiency of girls higher than that of boys — a pattern that sharply contradicts the direct cognitive assessment pattern. That is, whereas the direct assessment finds that only about 33% of students at or above the 99th percentile are female, teachers rate girls to be over 60% of the top students.” (p. 10 of the study)

So an alternative soundbite version of the study could have been: “mathematical ability of girls overrated by teachers.” But good luck getting featured on PBS and retweeted by the MAA saying that. It may be what the data says, but that’s irrelevant. People don’t want to hear that conclusion. Everybody knows that anyone who is “progressive” in education is fighting against rampant gender bias. And if you want to get anywhere in academia (and certainly if you want a sweet faculty gig at New York University like the lead author of the study) then you had better toe the “progressive” party line.

The purpose of educational “research” is to reach ideologically desirable conclusions. The purpose of educational “research” is to tell people what they have already decided they want the answer to be. Antiquated notions like critical thought and questioning preconceptions have no place in “progressive” educational “research.”

This being so, the authors of the study of course found themselves in a pickle when the data clearly contradicted The Only Acceptable Truth. How annoying that teachers didn’t play ball and show that clear gender bias that the researchers needed to find in order to conform to the ideologically predetermined outcome of the study!

But the researchers found a clever way of getting around this problem. They also asked the teachers about the behaviour of the students, and naturally the teachers rated girls as better behaved than boys. So, by doing some statistical trickery to shift the baseline accordingly, the researchers managed to make the desired gender bias appear out of nowhere by comparing teachers’ assessment of ability not with actual test scores but with test scores after behaviour scores had been subtracted or factored out.

This puts the sensationalist quotes above in a whole new light. What they seemed to be saying was that girls were consistently underestimated even when they performed at the same level, and furthermore even when they displayed the same behaviour. We are supposed to react: Unbelievable! What more do these girls need to do to be respected and taken seriously?!

But the reality is very different. It’s not: gender bias is rampant, EVEN when girls behave the same way and everything. Rather it’s: there is NO bias against girls at all, UNLESS a dubious “behaviour” score is used to shift the baseline. The behaviour condition does not make the claim of gender bias stronger; it makes it much, much weaker, not to say altogether unsustainable.

Indeed, the problem with this baseline shift is blatantly obvious: How do we know that teacher assessments of behaviour themselves are not massively biased? After all, isn’t “good behaviour” a more nebulous concept than mathematical ability as measured by standard tests? And the researchers had no other measure of “behaviour”: they only had the teachers’ opinions and no way whatsoever to check them objectively. Since the teachers were, if anything, biased in favour of girls on ability, could they not also be biased in favour of girls on behaviour? If teachers score girls’ behaviour too generously, then a girl and a boy with the same behaviour score are not in fact equally well behaved, and the “equally behaving” comparison is skewed from the start. This could in fact explain the entire effect detected, so that, ironically, the alleged research finding of bias against girls could just as well be due to a bias in favour of girls.

The authors themselves in fact say exactly this in a parenthetical “caveat”: “One caveat to consider is that teachers’ ratings of student behavior might be biased by student gender. For example, if teachers rate girls’ behavior as better than that of equally behaving boys, then this bias would contribute to the gender gap we see in teacher ratings of girls and boys as well as to our findings regarding the underrating of ‘equally’ behaving and equally performing girls and boys. ... The possibility of biased ratings of behavior suggests that caution is warranted in interpreting results.” (pp. 13-14)
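To see how this could play out, here is a toy simulation. It is not the study's data or model: every number, variable name and the linear set-up below are made up purely for illustration. In this imaginary classroom, teachers rate math ability with no regard to gender at all (the rating depends only on ability and on actual behaviour), but they score girls' behaviour a bit too generously. Regressing the math rating on ability and gender shows no gap; add the inflated behaviour score as a control and a "bias against girls" can appear.

```python
import numpy as np


def simulate(n=20000, behaviour_bias=0.5, seed=0):
    """Toy model with hypothetical parameters, not the study's data."""
    rng = np.random.default_rng(seed)
    girl = rng.integers(0, 2, size=n).astype(float)  # 1 = girl, 0 = boy
    ability = rng.normal(size=n)          # same distribution for both sexes
    true_behaviour = rng.normal(size=n)   # same distribution for both sexes

    # Assumption: the math rating is gender-blind. It depends only on
    # ability and on actual behaviour, plus noise.
    math_rating = ability + 0.5 * true_behaviour + rng.normal(scale=0.5, size=n)

    # Assumption: the behaviour rating is inflated for girls.
    behaviour_rating = (true_behaviour + behaviour_bias * girl
                        + rng.normal(scale=0.5, size=n))

    def girl_coefficient(*controls):
        # Ordinary least squares; returns the coefficient on the girl dummy.
        X = np.column_stack([np.ones(n), girl, *controls])
        beta, *_ = np.linalg.lstsq(X, math_rating, rcond=None)
        return beta[1]

    raw_gap = girl_coefficient(ability)
    adjusted_gap = girl_coefficient(ability, behaviour_rating)
    return raw_gap, adjusted_gap


raw_gap, adjusted_gap = simulate()
print(f"gap controlling for ability only:        {raw_gap:+.3f}")
print(f"gap controlling for ability + behaviour: {adjusted_gap:+.3f}")
```

With these made-up parameters the first gap should come out close to zero and the second clearly negative, even though no bias against girls was built into the math ratings. Set behaviour_bias to 0 and the second gap should vanish as well.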

But this little “caveat” is buried on page 13 of the actual, peer-reviewed article, and the called-for “caution” is of course nowhere to be found in the sensationalist drivel being peddled by journalists and on Twitter because it caters to the reigning ideology. Which is why when people say “research shows...” one must understand that this means “because of my preconceived beliefs I refuse to consider any other possibility than...”