Some critiques of mathematics education research articles

Research in mathematics education is generally hopelessly unintelligent. The field as a whole is not far from worthless. Below are some critiques of research articles demonstrating this in specific instances. These articles are very much representative of the quality of the field as a whole. I have not sought them out because they were especially easy to poke holes in. On the contrary, these articles were all assigned reading in mathematics education graduate courses that I took.

For further critiques establishing the same point, see my mathematics education book reviews and Can There be “Research in Mathematical Education”? by Herbert S. Wilf.


Catsambis, S., Mulkey, L. M., & Crain, R. L. (2001). For better or for worse? A nationwide study of the social psychological effects of gender and ability grouping in mathematics. Social Psychology of Education, 5, 83–115.

Mulkey, L. M., Catsambis, S., Steelman, L. C., & Crain, R. L. (2005). The long-term effects of ability grouping in mathematics: A national investigation. Social Psychology of Education, 8, 137–177.

These studies illustrate an ideological assumption that is rarely if ever justified, namely the assumption that ideally everyone should study lots of mathematics and feel good about their mathematical ability. In my view this assumption is highly irrational. It is not in anyone’s best interest that students overestimate their own abilities or that they are strung along in course after course in which they learn just about enough procedural nonsense to scrape by with a passing grade.

It is easy to see how inflating a student’s confidence can have disastrous effects. This student may very well be struggling in other subjects as well, so if he is pampered in his mathematics class he may be led to the misconception that this is his area of strength. Thus he will keep taking mathematics courses until he finally realizes that although he managed to get by in each individual course he does not have the level of understanding necessary to do anything meaningful with his mathematical coursework, such as pursuing a degree in a STEM field. Thus his efforts to do well in his mathematics courses, and perhaps even to take electives, have been wasted. He would have been better served by realizing this earlier, so that he could focus on another academic path, rather than being misled by feel-good mathematics classes.

This, I say, is the type of consideration that the studies by Catsambis et al. (2001) and Mulkey et al. (2005) fail to take into account. They simply assume that higher confidence and willingness to take more mathematics is a good thing. The overall conclusions of these studies are largely the same; we may quote from the former for definiteness:

“In sum, based on our study, we conclude that the effects of tracking focus on ‘conferring status’, thus supporting the aphorism, ‘It is better to be a big frog in a small pond than a small frog in a big pond’.” (Catsambis et al. (2001), p. 105)

That is to say, students form a self-image by comparing themselves to their peers. Thus, for example,

“Tracking has a strong, but negative, association with certainty of high school graduation for all students with a high-track propensity. … The opposite is true for males and females with a low-track propensity who remain more certain of their high school graduation. … Similar results are found for students’ college plans.” (Mulkey et al. (2005), p. 159)

Thus the effect of tracking is that “students with a propensity for a high track are negatively affected whereas students with low track propensity are positively affected,” since “eventually, academic self-concept figures into test scores and grades” (Mulkey et al. (2005), p. 165).

This is taken to be an argument against tracking: “This pattern suggests that tracking’s positive instructional effect is attenuated by indirect social mechanisms” (Mulkey et al. (2005), p. 165), for “when males are grouped with peers of similar high ability, they lose their competitive edge, and it becomes difficult for them to realize their positive attributes” (Catsambis et al. (2001), p. 103). “So, for highly tracked males the negative academic self-concept may unintentionally depress future performance if it results in the avoidance of taking elective, advanced math courses” (Mulkey et al. (2005), p. 144).

But none of this need be a bad thing. As I argued above, overestimating one’s talent for mathematics may be extremely harmful in the long run, even though in the short term it may be beneficial in all respects. The fact that “eventually academic self-concept figures into test scores and grades” may suggest to some people that we should increase everybody’s mathematical self-esteem. Even if this would indeed lead to better results, it can still be harmful. For it may be that it leads to better results only by tricking students into thinking that their future lies in a mathematical field. No wonder then that they work harder in their mathematics classes and get better grades: they do so with the understanding that this will eventually be relevant for their future career path. Once they reach the realization that they are not suited for a career focused on mathematics they will be bitterly disappointed and feel that they have been misled into focusing on mathematics under the impression that they were especially good at it. And yet on superficial measures this type of student is a “success”: they work hard, obtain high scores relative to their ability, and take elective mathematics courses.

I reiterate my point that the equity-based arguments against tracking run the risk of assuming that higher achievement and self-esteem among students is necessarily a good thing. Instead, I have argued, high achievement and self-esteem can be “bought” by a dishonest inflation of students’ self-images that is ultimately detrimental in the long term.


Shores & Shannon (2007). The effects of self-regulation, motivation, anxiety, and attributions on mathematics achievement for fifth and sixth grade students. School Science and Mathematics, 107(6), 225–236.

A study of 761 Alabama fifth and sixth graders using an extensive Likert-type questionnaire. Regression analyses showed that motivation and anxiety were correlates of achievement in the expected ways. Reasonable people might conclude that poor performance leads to anxiety and high achievement leads to higher levels of motivation, which is hardly something we need “research” to tell us. But, alas, the situation is much worse than proving something obvious. Instead the authors conclude that “academic achievement is effected [sic] by such factors as motivation [and] anxiety” (p. 231), thus committing the elementary fallacy of taking correlation to imply causation. There is no basis in the study for such a causal inference. Presumably the researchers prefer it to the harsh reality of the common-sense interpretation because it legitimizes the kind of feel-good, PC movement discussed above.
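
The fallacy is easy to exhibit concretely. Below is a minimal simulation sketch (mine, not the authors’; the variable names, effect sizes, and noise levels are all invented for illustration) in which, by construction, causation runs only from achievement to anxiety. A regression of achievement on anxiety nevertheless produces exactly the kind of strong “effect” the authors report:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 761  # same sample size as the study

# By construction, causation runs ONLY from achievement to anxiety:
# poor performance breeds anxiety, not the other way around.
achievement = rng.normal(50, 10, n)
anxiety = 100 - 0.8 * achievement + rng.normal(0, 5, n)

# Now regress achievement on anxiety, treating anxiety as the "cause".
slope, intercept = np.polyfit(anxiety, achievement, 1)
r = np.corrcoef(anxiety, achievement)[0, 1]
print(f"slope = {slope:.2f}, r = {r:.2f}")
# The regression "finds" that anxiety strongly predicts achievement,
# even though anxiety has no causal effect on achievement whatsoever.
```

The regression output is identical whichever way the causal arrow points; only a design that manipulates or temporally orders the variables could tell the two stories apart, and a one-shot questionnaire does neither.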


Speer, N. M., & Wagner, J. F. (2009). Knowledge needed by a teacher to provide analytic scaffolding during undergraduate mathematics classroom discussions. Journal for Research in Mathematics Education, 40(5), 530–562.

An investigation of the pedagogical content knowledge needed by teachers to constructively guide classroom discussion. The authors’ first episode concerns a classroom discussion of the problem of modeling “a continuously reproducing species of fish in a lake” (p. 542). The problem posed was:

“This situation can also be modeled with a rate of change equation dP/dt=something. What should the something be? Should the rate of change be stated in terms of just P, just t, or both P and t?” (p. 542)

Of course the “right” answer is dP/dt=kP. According to the authors, “understanding the direct dependence of [the] differential equation on P but not on t is a conceptual challenge for students to overcome”––in fact, it is “the central conceptual challenge” (p. 543). I say that it is the authors themselves who have a deficient understanding of the situation. They claim that it is “not the case” that “dP/dt [is] expressible in terms of t,” since “expressing dP/dt solely in terms of t (dP/dt=f(t)) would indicate that the rate of change of the population is independent of the size of the population, which is not a reasonable assumption for any living species” (p. 543). This is nonsense. No such assumption is “indicated” by giving dP/dt in terms of t.
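
To spell this out (my gloss, not the authors’): if P is the specific population of the problem, say the solution P(t)=P_0 e^{kt} with initial population P_0, then dP/dt=kP=kP_0 e^{kt}, so setting f(t)=kP_0 e^{kt} gives dP/dt=f(t) exactly. Writing the rate of change in terms of t alone “indicates” nothing whatsoever about that rate being independent of the size of the population.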

In fact, the “right” model dP/dt=f(P) and the “wrong” model dP/dt=f(t) are equivalent in this case, since the statement of the problem explicitly concerns “a continuously reproducing species of fish” and says that it is “this situation” that is to be modeled. This suggests that we are dealing with a specific P, which of course can be expressed as a function of t without any loss of generality. Of course it would be different if one were looking for a general population modeling equation, but that is not the problem posed. So I would say that “the central conceptual challenge” is rather that the course materials ask one question and expect the answer to another. The authors continue:

“In other words, realizing that an initial condition is irrelevant to the question posed (i.e., that the differential equation must hold for all possible initial conditions simultaneously) is a challenging learning objective for the students as they work through this activity.” (p. 544)

No wonder! It is bound to be very challenging indeed, since the course materials have gone out of their way to emphasize that the problem concerns a specific type of fish in a specific lake, etc. (though perhaps stopping short of calling it a specific population). Here’s a piece of analytic scaffolding for you: if you want a general answer, pose a general question.

A second episode in the discussion of the same problem concerns the distinction between dP/dt=P and dP/dt=e^t as possible models. The “right” thing to do is of course to “see that P(t)=2e^t satisfies only one of them” (pp. 546–547). According to the authors an opportunity to make this point presented itself naturally in class discussion. Namely, “Rob’s observation … was closely related to [this] point” (p. 547), and “could have been highlighted and clarified for the whole class,” which “might have allowed the class to distinguish between the two differential equations under consideration” (p. 546). “Unfortunately, Rob made the unhelpful suggestion, ‘say your P(t) was P+t’” (p. 547). This is “unfortunate” and “unhelpful” only to narrow minds who have already decided what the “right” outcome of the discussion is (thus obviating the need for a discussion in the first place). A far more plausible interpretation than the authors’ is that Rob was simply looking for a function that is not its own derivative: in fact, he is offering the simplest possible variant of an exponential P that will fail to be its own derivative. Thus the authors’ desired point regarding e^t versus 2e^t is completely irrelevant, since both of these functions are indeed their own derivatives.
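
To make both halves of this explicit (my reconstruction, not the authors’): the verification the authors wanted is that P(t)=2e^t gives dP/dt=2e^t=P, so dP/dt=P is satisfied, whereas dP/dt=e^t would require 2e^t=e^t, which fails. And if Rob’s “P+t” is read as P(t)=e^t+t, then dP/dt=e^t+1, which is not P: the simplest modification of an exponential that is no longer its own derivative, which is exactly the point I am attributing to him.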

The rest of the authors’ interpretations of their data are also flawed. On one occasion, for example, they pointlessly discuss what the teacher might have done with a student contribution which was admittedly inaudible to him (p. 556). But let us leave these issues aside and consider what the authors purport to conclude from their study.

When we turn to the conclusions section we find the worthlessness of the study confirmed anew, as we read nothing but truistic fluff such as the following:

“We do believe … that teaching expertise in reform oriented practices of this sort is enhanced as teachers develop the types of knowledge considered here” (p. 558),

where the “types of knowledge” in question are in effect synonyms for good teaching,

“such as knowledge of typical ways students think (correctly and incorrectly) …, knowledge of the curriculum in use, and knowledge to support the specialized type of mathematical work teachers do when dissecting and analyzing students’ expressions of their ideas.” (p. 558)

Basically, then, the authors have made the astounding discovery that knowing how to teach well is positively correlated with teaching well.


Dahl, B. (2004). Analysing cognitive learning processes through group interviews of successful high school pupils: Development and use of a model. Educational Studies in Mathematics, 56, 129–155.

This article is an attempt to introduce and support a cognitive model of mathematical learning. In this review I shall argue that Dahl’s enterprise is ill-conceived and that her conception of theoretical research is naive.

Dahl’s model is called CULTIS for the “collection of themes” (p. 134) that constitute it: “Consciousness – Unconsciousness; Language – Tacit; Individual – Social” (p. 134). The first thing to note here is that the “model” proposed by Dahl is nothing but a “collection of themes.” This in itself renders a number of the claims she makes for the value of her theory highly suspect. Before looking at specific examples of this, I may illustrate my point by an analogy. Suppose someone proposed an “HLHC” theory of physics, which was nothing but a “collection” of the “themes” Heavy – Light; Hot – Cold. It is of course an undeniable fact that the constituent “themes” of the HLHC model are crucially important concepts in physics, and that if you sit down in a physics classroom you will find that much of what is being said relates to these categories. But this does not imply that pasting them together and giving them an acronym is of any value whatsoever, especially if the importance of these concepts has been well known for a long time.

I say that precisely this error is being committed by Dahl. She assumes that because each of her “themes” shows up regularly, bundling them into a “model” and giving it an acronym somehow constitutes an advance in the theoretical understanding of mathematical cognition. And this despite the fact that each of the “themes” has been given much attention previously; indeed that is how Dahl herself says that she came up with them: “CULTIS was created after systematically going through [previous] theories noticing which themes they brought up” (p. 134).

As an illustration of Dahl’s misconceptions in this regard, we may note the following quotation: “the distance between what the pupils said and the theories is not big which supports the explanatory power of the theories” (p. 152). Dahl has not learned the elementary lesson that descriptive accuracy is not the same thing as explanatory power. If my HLHC theory of physics says that heavy objects fall down if dropped, then the theory is descriptively accurate, but obviously it has no explanatory power whatever; it merely restates a phenomenon without explaining anything.

Another example may illustrate that the alleged “explanatory power” of Dahl’s theory is of precisely this vacuous type. Some students in the study said that lack of familiarity with a given teaching style may hamper learning:

“D: When I first came here [to the new school], the first couple of weeks I found math very difficult because it is kind of hard to adapt to a different teaching style.” (p. 151)

According to Dahl,

“This phenomenon might be explained by stating that the teaching method should be within what I henceforth will call a zone of proximal teaching (ZPT). … If a (new) teacher uses teaching methods that are too ‘far away’ from what the pupil is used to, the pupil may not learn.” (p. 151)

Again Dahl appears to be under the delusion that to explain something is to restate it in a pompous way and to give it an acronym, because her proposed “explanation” adds nothing but pretentious verbiage to the statement of the student.

But even if we reject Dahl’s pretensions to offer an explanatory theory, one might argue that her model nevertheless has merit as a useful synthesis of previous theories. I maintain, however, that this is not the case. The way the themes are thrown together in the CULTIS model is haphazard and lacking in sense and motivation. As an illustration of this, let us consider the role of visualization. Dahl sorts this under “Individual,” contrasting it with “Social” (p. 140). A myriad of obvious problems with this arrangement suggest themselves immediately, all of which are ignored by Dahl. Why should visualization be grouped with “self-activity” (p. 140)? Isn’t visual thinking better contrasted with the theme “Language” than with “Social”? The antipode of “Language” is “Tacit,” described as the attitude that the pupil “cannot tell but only show” (p. 140), which sounds almost synonymous with visualization. Indeed, the descriptions of “Tacit” and “Individual” are confusingly similar. As a characterization of the “Tacit” theme we read:

“It is therefore actions that form the roots of logical and mathematical thoughts.” (p. 137)

But then the theme “Individual” is described in virtually the same terms:

“Thus the logical-mathematical abilities do not arise from language or linguistic competency, but from the ability to coordinate actions.” (p. 137)

Dahl’s arbitrary bundling of visual with individual makes her analysis heavily theory-biased. Her “Pupil C” said:

“I tend to learn more when visual or rather than just [A coughs] look at it [A & C laugh, some words are missed]. Yea, I’ll try to make it more visual, like that (I: mmm).” (p. 147)

From here Dahl concludes that “Pupil C … tells that he is ‘more visual’, which is individual learning” (p. 147). Obviously the data does not support this identification of visual learning with individual learning. Although the transcript contains much useless information about who was coughing when, the crucial passage where Pupil C explains what he is contrasting visual thinking with (social? linguistic?) is “missed.”


Brown, S. A., Pitvorec, K., Ditto, C., & Kelso, C. R. (2009). Reconceiving fidelity of implementation: An investigation of elementary whole-number lessons. Journal for Research in Mathematics Education, 40(4), 363–395.

This study investigates to what extent elementary school teachers working within a Standards-based curriculum are faithful to the spirit of this material in their teaching. The authors summarize their findings as follows:

“we (a) concluded that the level of fidelity to the literal lesson does not determine the level of fidelity to the authors’ intended lesson, and vice versa; (b) observed that individual teachers’ enacted lessons tend to have some consistency in their ratings for level of fidelity to the authors’ intended lesson; and (c) identified two lesson types––lessons for which enactments varied by teacher and lessons for which the [level of fidelity] rating for the enactments appears to be related to the lesson itself.” (pp. 389–390)

I say: these results are all truistic, and consequently the study is next to worthless. To substantiate this claim, let us consider the results in order.

The truistic nature of (a) is readily apparent once we note that “the literal lesson” means the written curricular materials and “the authors’ intended lesson” refers to their “underlying philosophy” (p. 369). Surely it is not surprising that overworked teachers sometimes rely on the pre-fabricated “literal lessons” without reflecting very much on their “underlying philosophy.” Nor is it surprising that, conversely, the “underlying philosophy” may be read into a lesson of a teacher who strays from “the literal lesson,” since such a teacher may quite plausibly have a similar philosophy herself, regardless of the curricular materials. The chances of such accidental compliance with the “underlying philosophy” are especially marked since Brown et al. characterize the “underlying philosophy” in terms of extremely general “opportunities to learn,” such as “opportunities to reason to solve problems; opportunities to reason about mathematical concepts” and “opportunities to validate strategies or solutions; reason from errors; inquire into the reasonableness of a solution.” Clearly, such “opportunities to learn” are bound to arise in any mathematics classroom regardless of the “underlying philosophy” of the curricular materials.

In (b) we have another unremarkable result, namely that not all teachers are equally enthusiastic about the prescribed curriculum. It would have been stunning indeed if the researchers had found that a teacher’s background, training, etc., had no consistent impact at all on her attitude towards the curricular materials she had to use.

As for (c), this may sound like an interesting result––what type of lessons are teachers enthusiastic about?––but we are disappointed to find that this question is left unanswered except by mere equivocation: the “type” of lesson that received consistent level-of-fidelity ratings is not independently characterized but rather simply defined as the set of all such lessons (p. 387). So all (c) really says is that sometimes the level of fidelity appears closely related to the content of the curricular material for that lesson and sometimes not. The most plausible interpretation of this result is surely that the curricular materials are (i) not all of the same quality, (ii) not all equally manageable or realistic to implement, (iii) not all equally accessible to typical teachers. Again, nothing about this is the least bit novel or enlightening.

Having dismissed the concrete results of the study, we must recognize that the authors purport to have made a theoretical advance as well, viz. “by providing a framework with which to view teachers’ enactment of lessons that connects students’ engagement in opportunities to learn mathematics with those intended by the curricular materials” (p. 390). A “framework” is only as good as the insights it yields––it is not an end in itself, as the authors seem to imply when they note that it “contributes to [a] growing body of research” (p. 390)––so I think we are justified in ignoring it until the authors have proved it fruitful. Even so, we could not help noticing above why the extremely general and vague notion of “opportunities to learn” is bound to be virtually useless for analyzing the specific intentions of authors of curricular materials.

I should also like to point out that the authors’ notion of “research” is a very twisted one. The authors are apparently convinced that the only way to “research” these “teaching beasts” is to record their guttural cries in the wild, and then have rational men reconstruct their savage mannerisms on the basis of this data. Why not simply talk to the teachers about why they choose to implement certain aspects of the curricular materials but not others? (While the notion of talking to the teaching beasts was apparently inconceivable to the researchers, textbook authors were considered rational animals, for “if the authors’ intended lesson was not clear, we … asked for clarification from the authors” (p. 382).)

Ignoring this common-sensical approach, the authors instead choose to focus their “researcher’s lens” on a minuscule data set, namely 33 video-recorded classroom lessons from a total of 14 teachers (p. 375); i.e. an average of 2.4 lessons per teacher. Anyone who has ever taught more than 2.4 lessons knows perfectly well that an unfortunately chosen sample of that size could easily lead to innumerable mistaken impressions; especially so when the unit of analysis (“opportunity to learn”) is absurdly abstract and most likely far removed from the terms in which the lesson was conceived. Furthermore, as bad as it would have been if the measly 2.4 lessons were a random sample, they were apparently not even that, for “several teachers expressed that, since they were being observed, they felt compelled to teach the lesson as written” (p. 389), thus nullifying any remaining shred of credibility that the data may have had.


Vale, C. M., & Leder, G. C. (2004). Student views of computer-based mathematics in the middle years: Does gender make a difference? Educational Studies in Mathematics, 56(2), 287–312.

The authors purport to have found significant gender differences in attitudes towards computer-based mathematics classes. After discussing the wider context of this study, I shall, for lack of space, focus my critique solely on its quantitative part, which I say should be dismissed owing to the stupidity of the researchers’ methodology.

It is natural to situate this study in the context of the NCTM Principles and Standards. This document’s “Equity Principle” notes that different students thrive under different conditions, expectations and stimuli, and calls for each student to be nurtured accordingly. Insights regarding gender differences are thus highly pertinent to the successful implementation of this principle. However, the study by Vale and Leder can be criticized for falling short of this goal in several regards. One issue is that Vale and Leder focus almost exclusively on description of the phenomenon, offering little of substance for improved practice. Also, their study partially diverges from the Principles and Standards when it comes to the specific uses of computers in the classroom. Vale and Leder studied two groups of students. From what we learn of the first group, their use of computers seems to be a near-perfect enactment of the vision of the Principles and Standards. In particular, they focused on using Geometer’s Sketchpad to form and test conjectures (p. 293)––a theme emphasized throughout the Principles and Standards. But the second group of students seem to have used computers for entirely different reasons. A telling illustration is their use of PowerPoint to present solutions to simple linear equations (p. 293). It is hard to imagine a mathematical rationale for this; the purpose seems rather to have been preparation for the business world.

This considerable discrepancy in the uses of computers raises the issue of whether it makes sense in the first place to talk about “attitudes to computer-based mathematics” in the abstract. No one would dream of drawing conclusions about “students’ attitudes towards the use of books in mathematics classes” based on a study involving one or two books. Whether “computer-based mathematics” is a cohesive enough concept to warrant such conclusions in this case is highly questionable, and, in any case, an issue never touched upon by Vale and Leder. Thus one may question whether the research question makes any sense in the first place.

But let us now turn to the results of the study itself. As I said, I only have room to discuss its quantitative aspect. The researchers considered a number of quantitative parameters but obtained statistical significance in only two cases: first, a predictable and quite uninteresting correlation between “achievement in computing” and appreciation of computers in mathematics classes (pp. 306–307), and, second, and potentially more interestingly, a significant gender difference in attitude to computer-based mathematics. The latter was measured by asking the students to indicate their agreement or disagreement with the following statements on a Likert scale.

“[A] I’ve improved in maths since we started using computers in maths.
[B] I’ve gone backwards in maths since we started using computers in maths.
[C] I am sure I could do difficult maths with the use of a computer.
[D] Even a computer can’t help me learn maths.
[E] Using a computer in maths gives you a reason for doing maths.
[F] Using a computer in maths does not make maths any more useful.
[G] I find that using computers helps me to learn maths.
[H] Using computers in maths means you won’t be able to do maths without them.
[I] Maths is easier to understand when you use computers.
[J] Using computers in maths makes maths more confusing.
[K] Computers are excellent for doing things for maths.” (p. 297)

I say that the results of this study should be disregarded because of the sheer stupidity of this list. A number of the entries are in effect guaranteed to be true or false, thus being completely useless for the purpose of the study. For example, A and B are bound to be true and false respectively for any student whose teaching is not outright detrimental to learning. Hence A and B are useful only for comparing computer classes with such detrimental teaching, or with no teaching at all, neither of which is a realistic alternative for the purposes of this study. So the results for A and B should be discarded as saying nothing of interest in reply to the research question at hand.

Another entry on the list that is trivially true is K. This statement has nothing to do with teaching. It merely asserts a fact about the capacities of computers that no sane person could possibly dispute. Obviously the researchers intended K to be understood differently, but since it is trivially true in its literal meaning all student replies must be discarded for this entry as well.

The same goes for C. Presumably the researchers intended this statement to be interpreted as “I am sure of my current ability to do difficult mathematics with a computer.” But C could just as well be interpreted as analogous to “I am sure I could go on a diet and lose 40 pounds (though perhaps I do not think it worth the effort),” in which case it would be trivially true, so again all replies must be discarded.

Similarly, H is trivially false. Of course using computers in mathematics does not mean that you won’t be able to do mathematics without them. It may be that the use of computers has such a detrimental effect in many cases, but that is not what is being asked. As it stands, H is plainly false, so all replies must be discarded.

Finally, D is a very stupidly formulated statement. The “even” insinuates that if a computer cannot teach you mathematics then no one can. In order to agree or disagree with this statement one must in effect accept this implicit premise. Therefore, since this premise is obviously highly dubious, not to say outright dumb, all answers to this statement must be discarded.

In all of these cases, then, the statistical significance obtained need not have anything to do with the issues at hand. Instead the gender differences may be due exclusively to differences in interpretation of the questionnaire, viz., the difference between interpreting it literally and second-guessing what the researchers intended to ask.

I conclude, therefore, that the only legitimate entries on the list are E, F and G. Since the researchers do not disclose their data for replies to individual statements, we have no reason to believe that the declared statistical significance still holds for these three entries. Therefore we should disregard the results.
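
The worry is easy to make concrete. Here is a minimal simulation sketch (mine, not from the paper; the sample sizes and effect sizes are invented) in which the only gender differences are in the illegitimate items (say, because boys and girls second-guess the ambiguous wording differently), yet the full eleven-item scale comes out highly significant while the legitimate E/F/G subscale does not:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 150  # students per gender (invented)

# Items E, F, G: no true gender difference on the legitimate items.
legit_boys = rng.normal(3.0, 1.0, (n, 3))
legit_girls = rng.normal(3.0, 1.0, (n, 3))

# The other eight items: responses differ by gender only because the
# two groups read the ambiguous wording differently.
other_boys = rng.normal(3.4, 1.0, (n, 8))
other_girls = rng.normal(3.0, 1.0, (n, 8))

# Each student's scale score is the mean of their item responses.
full_boys = np.hstack([legit_boys, other_boys]).mean(axis=1)
full_girls = np.hstack([legit_girls, other_girls]).mean(axis=1)

print(stats.ttest_ind(full_boys, full_girls))    # highly significant
print(stats.ttest_ind(legit_boys.mean(axis=1),
                      legit_girls.mean(axis=1))) # not significant
```

Significance for the scale as a whole, in other words, is perfectly compatible with the complete absence of any effect on the only items that bear on the research question.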


Yin, Y., Shavelson, R. J., Ayala, C. C., Ruiz-Primo, M. A., Brandon, P., Furtak, E. M., Tomita, M. K., & Young, D. B. (2008). On the impact of formative assessment on student motivation, achievement, and conceptual change. Applied Measurement in Education, 21(4), 335–359.

A study of 12 middle-school science classes, half using formative assessment and half not. All classes had a common curriculum, from which one particular unit was selected for the study. Treatment group teachers were provided with some sort of formative assessment training and materials specific to this unit, neither of which is described in this article (the authors refer to a different paper for details). Pretests and posttests were administered to test the effect of formative assessment on motivation, achievement and conceptual change, but no significant effect was observed on any of these measures. The inconclusive results of the study were not unexpected considering the innumerable confounding factors at play, which included spontaneous use of formative assessment by control group teachers, failure to implement the formative assessment materials among treatment group teachers, enormous discrepancies in class time devoted to the unit (varying from 63 to 249 days), severe ESL issues in some classes, and differences in standards and standardized testing since the classes were from several different states.


Remillard, J. T., & Jackson, K. (2006). Old math, new math: Parents’ experiences with standards-based reform. Mathematical Thinking and Learning, 8(3), 231–259.

This is a study of “how African American parents in a low-income neighborhood experience … current reform efforts” in mathematics (p. 231). The ultimate goal is to make parents “partners in mathematics education reform,” rather than “stumbling blocks,” as they are sometimes portrayed (p. 233), since parental involvement has been shown to have a significant positive impact on student achievement (p. 232). The present research project, however, sets itself the more modest goal of giving a phenomenological description of parents’ experiences in the hope that this will be “the first step in conceptualizing ways to include and support parents as partners in their children’s mathematics education” (p. 233).

My review shall focus on sources of bias in this research. I shall provide evidence that: (1) the authors have a dangerously uncritical conviction that the reform materials are wholly positive, (2) their conception of parents’ attitudes as almost entirely conditioned by personal school experience is not justified, (3) the study suffers from selection bias relative to the research goal as stated above. The authors themselves do not address these problems.

To illustrate points (1) and (2) we may consider how the authors deal with the fact that “none of the parents saw the connections between the mathematics represented in EM and the mathematics of their everyday lives” (p. 245; EM stands for Everyday Mathematics, a curriculum based on the NCTM Standards used in the school in question). Note the definite article: “the connections.” The authors take it for granted that the EM is excellent and full of such connections that are “readily apparent” (p. 255). Thus they must proceed to explain why the parents cannot see what is “readily apparent.” They propose as an explanation that the parents “had firmly established conceptions of school mathematics that were grounded in computational proficiency” (p. 255). But then the authors quickly go on to contradict themselves later on the same page: “[M]any of the learning goals that they held for their children overlapped with those central to EM. Parents described wanting their children to develop confidence, independence, and the ability to use math in their everyday lives. Several parents spoke of wanting their children to develop a deep understanding of math” (p. 255). As one parent put it, she wanted her daughter to “use her brain cells” (p. 253).

So how do the authors confront the fact that their own explanation regarding why parents were critical of EM material is here blatantly contradicted? They simply dismiss this contradiction as “ironic” (p. 255)! This seems to me revealing of the authors’ inability to conceive of the possibility that the parents’ criticisms have any merit. If the parents said they didn’t like the school cafeteria food or that the basketball coach was poor, then we wouldn’t be very good researchers if we dismissed their opinion as “ironic” just because we happened to like the food and the coach. Instead of contemptuously rationalizing the parents’ criticism as based on prejudice, we should listen to their arguments and see what merit they have.

The parents did indeed put forth arguments based on reason rather than prejudice. For example, one parent argued that “if I was teaching [my daughter] Serahn how to drive, I couldn’t do it in January and then one day in March and then in June she’ll go take the test, and pass it. It don’t work that way, not even with the math” (p. 248). This is an intelligent and substantial critique of EM, not an inability to see its “readily apparent” virtues due to preconceived notions of what school mathematics should be. One could give many more examples, which is not surprising since “old math” was advocated by intelligent people in its time. Not everyone’s approval of “old math” can be ascribed to bias conditioned by their own education, on pain of infinite regress, since “old math” is not so old as to have existed since the beginning of time.

How can the authors claim to aim at “conceptualizing parents as partners in mathematics education” (p. 257) if they do not take such critique seriously? They agree that this goal “requires serious consideration of parents’ genuinely felt rejection of reforms” (p. 257). Genuinely FELT, that is the key word: this is a matter of emotion rather than reason, according to the authors. The authors apparently consider themselves very gracious in that they “contend that parents have the right to disagree with the reforms” (p. 257). Maybe opportunities to get their “feelings” off their chests will make these irrational little creatures feel better, the authors seem to hope.

I think the spectrum of debate here is revealing. The underlying assumption seems to be that “reform” is great and that, therefore, parents’ critiques of it must be incompetent. Given these assumptions, one can debate whether or not parents should be allowed to speak at all (let alone be listened to, which is of course out of the question). The fact that the authors consider themselves on the liberal end of the spectrum and feel that parents’ right to disagree is something for which one needs to “contend” is a depressing indicator of how deeply ingrained these underlying assumptions are.

Finally, I should like to point to a separate issue of bias in this research, namely selection bias. The authors were careful to avoid selection bias with respect to variables such as age, education, employment, etc. (p. 239). Yet they ignored what is arguably the most crucial variable for the purposes of the present study: involvement. Here we have the surely very exceptional state of affairs that “All 10 parents were heavily involved in their children’s mathematics learning beyond homework assistance” (p. 254). Given that the research is justified by the significant impact of parent involvement on achievement, it is questionable whether studying already highly involved parents contributes to the stated goals of the research.