What makes a good axiom?

How should axioms be justified? By appeal to intuition, or sensory perception? Or are axioms legitimated merely indirectly, by their logical consequences? Plato and Aristotle disagreed, and later Newton disagreed even more. Their philosophies can be seen as rival interpretations of Euclid’s Elements.

Transcript

What kinds of axioms do we want in our geometry? How do you tell a good axiom from a bad one? Should an axiom be intuitively obvious? Should it be empirical, physically testable? Should it be logically self-justifying, or are axioms logically arbitrary?

The time has come to take a stand. As we have been reading Euclid backwards, we have seen how the Pythagorean Theorem can be reduced to a theorem on the areas of parallelograms, and how this theorem in turn can be reduced to triangle congruence. So now we have to prove triangle congruence somehow.

If two triangles have the same side-angle-side, then they are the same triangle. How to prove such a thing? We can’t keep playing our game of reducing every theorem to a simpler one, because we’re running out of “simpler.” Maybe this theorem is as simple as it gets? Maybe it’s the rock bottom? How do you decide anyway what’s simpler than what? It’s becoming more philosophy than mathematics to answer these questions.

This theorem—side-angle-side triangle congruence—is Euclid’s Proposition 4. Ok, so it’s a proposition, not an axiom. So apparently he has reduced it to something. But what?

Let’s read the proof. So we have two triangles, and they have certain measurements in common. Two sides of one triangle are equal to two sides of the other, and also the angle between those sides is the same in both triangles.

Euclid says he can prove that the other measurements are equal too. The remaining side, the remaining angles: it’s all equal. They’re the same triangle basically.

And here’s how Euclid says you can prove this. Take one of the triangles and put it on top of the other. We know that they have side-angle-side in common, so those parts line up perfectly. These three attributes are enough to “lock” the entire triangle into one unique shape, in fact.

Because suppose it wasn’t. Suppose the two triangles were different. Since they have side-angle-side in common, they lined up at least on those parts. This “locks” into position two of the sides and all three of the vertices. There is no way one of the triangles can stick out beyond the other in terms of these two sides or in terms of any one vertex. So the only way the triangles could be not equal would be if the third side somehow missed.

This would mean that the endpoints of the third sides were the same for both triangles, but the line joining them would be different. Impossible! You can’t have multiple lines connecting the same two points. Or, as Euclid puts it: two straight lines cannot enclose a space. You can’t draw a straight line from A to B, and then another straight line from A to B, in such a way that these two lines miss each other and have some space in between them.

Since this is impossible, the third sides of the triangles must line up on top of each other, and therefore the two triangles are identical, or congruent. That’s the proof.
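For readers who like to see this done with numbers, here is a minimal sketch of the superposition argument in Python. It is only an illustration, not Euclid's proof: it uses coordinates and trigonometry, which presuppose far more than his axioms do. The function names (`triangle_from_sas`, `superpose`, `dist`) and the sample measurements are made up for this example, and both triangles are assumed to have the same orientation, so no flipping is needed.

```python
import math

def triangle_from_sas(b, angle, c, origin=(0.0, 0.0), rotation=0.0):
    """Return vertices A, B, C with |AB| = b, |AC| = c and angle BAC = angle.

    Angles are in radians. The triangle is placed with A at `origin`
    and the side AB turned by `rotation` from the positive x-axis.
    """
    ax, ay = origin
    B = (ax + b * math.cos(rotation), ay + b * math.sin(rotation))
    C = (ax + c * math.cos(rotation + angle), ay + c * math.sin(rotation + angle))
    return (ax, ay), B, C

def superpose(triangle):
    """Slide and turn a triangle so A lands on (0, 0) and AB lies along the x-axis."""
    A, B, C = triangle
    theta = math.atan2(B[1] - A[1], B[0] - A[0])  # current direction of AB

    def move(P):
        x, y = P[0] - A[0], P[1] - A[1]           # slide A to the origin
        return (x * math.cos(-theta) - y * math.sin(-theta),  # turn by -theta
                x * math.sin(-theta) + y * math.cos(-theta))

    return move(A), move(B), move(C)

def dist(P, Q):
    """Straight-line distance between two points."""
    return math.hypot(P[0] - Q[0], P[1] - Q[1])

# Two triangles with the same side-angle-side data, sitting in different places.
t1 = triangle_from_sas(3.0, math.radians(50), 4.0)
t2 = triangle_from_sas(3.0, math.radians(50), 4.0, origin=(7.0, -2.0), rotation=1.3)

# Put both in the same standard position, i.e. one "on top of" the other.
s1, s2 = superpose(t1), superpose(t2)

# All three vertices coincide (up to rounding), so the third sides coincide too.
vertices_coincide = all(dist(p, q) < 1e-9 for p, q in zip(s1, s2))
third_sides_equal = abs(dist(t1[1], t1[2]) - dist(t2[1], t2[2])) < 1e-9
print(vertices_coincide, third_sides_equal)
```

Once the two known sides and the angle between them are lined up, all three vertices are pinned down. The third side then has nowhere else to go, because it is the unique straight line between two fixed endpoints.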

Once again the point of the proof is not to convince us that the theorem is true, but to reveal how its truth can be reduced to more basic truths. Euclid has now taken this as far as he can. We’re all the way down to the axioms: things that cannot be broken down any further.

The proof of the triangle congruence theorem rests most prominently on two axioms. One, as we saw, is that “two lines cannot enclose a space.” Which is equivalent to saying that, for any two points, there is only one straight line between them. This corresponds to Euclid’s Postulate 1, which states as an axiomatic principle that we can “draw a straight line from any point to any point.” It is understood that this line is unique. That is to say, there’s only one way you draw that line. So that’s an axiom. You can’t reduce it any further.

But there was another axiom involved as well in our proof of the triangle congruence theorem. Namely the assumption that we could put one triangle on top of the other. This corresponds to Euclid’s Common Notion 4: “things coinciding with one another are equal to one another.”

This is basically a definition of equality. What does it mean for two things to be equal? Put one on top of the other. If neither sticks out beyond the other, then they are equal. That’s what equal means. In fancier words you could say: equality means alignment under superposition. So that’s another axiom that Euclid states at the beginning of his work, and which he cannot prove from more basic principles.

What should we make of these two axioms? Since we can’t prove them from other things, they must be justified some other way. What way would that be?

Euclid apparently thought these two principles were especially suited to be axioms. He could have done it differently. He could have chosen other axioms. For example, the triangle congruence theorem itself could have been taken as an axiom. That’s what Hilbert later did, in his modern and very authoritative axiomatisation of geometry. So from the point of view of modern mathematics it makes a lot of sense to take the triangle congruence principle as axiomatic. From a logical point of view that’s perhaps the best approach. Modern logicians don’t like Euclid’s proof one bit. Bertrand Russell called it “logically worthless.” If you want mathematics to be logic, then that makes sense.

But what is “good” mathematics? That depends on your philosophy of mathematics. You must first decide what kind of thing mathematical knowledge is. What it should be. Only after you have made that philosophical decision do you have any basis for judging whether Euclid’s approach is better or worse than that of others.

Euclid’s choice of superposition as an axiomatic principle is quite interesting in this regard. It seems almost physical or empirical. In the proof of the triangle congruence theorem, you are literally, physically picking up one of the triangles and placing it on top of the other triangle. This seems to assume that triangles are physical objects, like cardboard cutouts or some such thing. And the idea that equality means alignment under superposition also has a somewhat physical feel. The thing fits on top of the thing. It’s something you could test practically, in the real world.

The modern authors I mentioned do not approve of these connotations. They don’t like it one bit that mathematics is so to speak contaminated by empirical considerations. They want mathematics to be pure reason. They don’t want it to depend on sense perception and physical experience.

But Euclid’s use of superposition suggests that he was less dogmatic about this. It could be interpreted as a sign that he was open to the idea of geometry as ultimately physical.

Of course geometry is still very theoretical. Obviously, to Euclid, you can’t justify things like the Pythagorean Theorem just by measuring things, the way you would verify a physical law by making a bunch of measurements in a lab. Of course geometry is not like that.

But the fact remains that the axioms cannot be justified by the axiomatic-deductive process itself. What axioms are the “right” axioms, or the “best” axioms, is a question that cannot be answered by purely mathematical means. Some philosophical assumptions will necessarily be involved in such judgements.

I wanted to use this as a bridge to discuss some Plato and Aristotle. I’m trying to emphasize how these things go together. Mathematics and philosophy. Reading Euclid leads naturally to philosophical questions. We reduced the Pythagorean Theorem down to superposition and uniqueness of lines. We faced the questions: Why stop there? Why these principles and not others? What kinds of foundations should geometrical knowledge be built upon?

This is the right time to read philosophy, with these burning questions fresh in our minds. Mathematics itself does not answer these questions. As Aristotle says in the Posterior Analytics: “for the principles a geometer as geometer should not supply arguments.”

So there is a kind of division of labor. Justifying the axioms is not the business of the geometer “as geometer.” But of course Aristotle didn’t mean by this that you should have mathematicians over there and philosophers over here and there’s no point for them to talk to each other. A better way to read it, I think, is this: geometers, as geometers, cannot justify their axioms, and therefore any geometer needs to be a philosopher as well.

Aristotle discussed the axiomatic-deductive method at length in this treatise, the Posterior Analytics. Here’s a quote that sums up his view: “Demonstrative understanding must proceed from items which are true and primitive and immediate and more familiar than and prior to and explanatory of the conclusions.”

Quite a list of demands! Axioms, such as those of geometry, should have all of those characteristics, according to Aristotle.

Obviously this means that the axiomatic-deductive method is a whole lot more than merely logical deductions from arbitrary assumptions. Indeed Aristotle says as much: “There can be a deduction even if these conditions are not met, but there cannot be a demonstration––for it will not bring about understanding.”

This places very significant restrictions on what could be a legitimate axiom in geometry. It must be “primitive and immediate and more familiar than and prior to and explanatory of the [theorems].” So axioms need to be self-evident, in other words, it seems. That’s more or less what Aristotle means by “immediate,” I suppose. And axioms must also be irreducible, not in turn derivable from some other principle. That seems to be the meaning of Aristotle’s demand that they be “primitive” and so on.

It gets pretty interesting when Aristotle elaborates further on what he means by some of these terms, because then he commits himself to the perhaps controversial stance that axioms are ultimately grounded in physical experience. Here’s what he says: “I call prior and more familiar in relation to us items which are nearer perception.” So immediate perception must be the ultimate foundation of “demonstrative understanding.” Not pure thought, but sensory perception.

The axioms are generalized or idealized facts of experience. As Aristotle says: “We must get to know the primitives [that is to say, axioms] by induction; for this is the way in which perception instills universals.” For instance, for any two points there is a unique line connecting them. This is a fact of experience, but of course generalized––“by induction,” as Aristotle says. That is to say, we have observed this in many examples. For this particular pair of points there’s a unique line, and for that pair, and so on. These are facts of perception. And then “perception instills universals by induction”: that is to say, we generalize from these examples to the general principle that this will hold for any two points, not just the numerous examples we have witnessed.

So Aristotle thinks the axioms of geometry ultimately come from concrete experience. The credibility of the axioms, the certainty of the axioms, derives from immediate sensory experience.

This fits pretty well with the principles to which Euclid reduced everything. It is known through experience that there is a unique line from any point to any point. For instance by pulling a string between two points you can get a very direct sensory feeling for the existence and uniqueness of that line. And the principle of superposition, of putting one triangle on top of the other, can likewise be seen as an idealized version of a very immediate and basic physical experience.

But not everyone agreed. Plato is the opposite of Aristotle. He has complete contempt for the physical world, and he loves mathematics precisely because it is something purer and higher than physical experience.

Let me quote Proclus expressing this view. Proclus is a follower of Plato. He is keen to argue that mathematics stems from the soul, not sense experience. He addresses the Aristotelian view, and he sums it up like this: “Should we admit that [the objects of mathematics] are derived from sense objects, either by abstraction, as is commonly said, or by collection from particulars to one common definition?” That’s what Aristotle had argued, but Proclus says: No, we should not accept that.

And here’s why. Geometry cannot be based on physical experience, Proclus says, because “The unchangeable, stable, and incontrovertible character of the propositions [of mathematics] shows that it is superior to the kinds of things that move about in matter. And how can we get the exactness of our precise and irrefutable concepts from things that are not precise? We must therefore posit the soul as the generatrix of mathematical forms and ideas,” not physical reality.

Plato was quite obsessed with this idea that pure thought is the highest and most noble thing in human life. In the Timaeus he elaborates on this idea in a rather amusing and poetic way. To philosophise is the purpose of life. Human anatomy is merely an appendix to the soul and the mind. “The entire body” was created “as its vehicle,” Plato says. That is to say, the body exists only to make philosophising possible.

For example, consider the intestines of the human digestive system. They are very long and winding, right? Like you roll up an extension cord when putting it away in a drawer; it looks like that in our insides. Food doesn’t go in a straight line from the mouth and out the other end. Instead the body passes it through the intestines that go back and forth, back and forth, a very long distance.

Plato thinks he knows why. Here’s how he explains it: “The intestines are wound round in coils to prevent the nourishment from passing through so quickly that the body would of necessity require fresh nourishment just as quickly, thereby rendering it insatiable. Such gluttony would make our whole race incapable of philosophy and the arts, and incapable of heeding the most divine part within us.”

So the human body is just a means to an end. The only thing worth anything is philosophy. Eating doesn’t have any value in itself. The only purpose of eating is to put off the annoying needs of the body for a while, so as to give us time to think.

Plato has a similar theory regarding eyesight. To Aristotle, the senses were a source of knowledge. The foundations of geometry rested on sensory experience. Of course Plato disagrees. The purpose of eyesight is just like that of the intestines: it’s just a physical crutch whose ultimate goal is to support pure philosophy. Here’s how Plato puts it:

“Our ability to see the periods of day and night, of months and of years, of equinoxes and solstices, has led to the invention of number and has given us the idea of time and opened the path to inquiry into the nature of the universe. These pursuits have given us philosophy, a gift from the gods to the mortal race whose value neither has been nor ever will be surpassed. I’m quite prepared to declare this to be the supreme good our eyesight offers us.”

So eyesight is not a good in itself, but merely a stepping-stone toward philosophy. It’s a kind of necessary evil, like the intestines. It would be better if we didn’t have to eat at all, but given that we live in this feeble physical world, the best we can do is to make the food take a long time to go through us so we have as much time as possible to think in between meals.

In the same way, ideally, we wouldn’t need eyesight. Ideally, we would do pure philosophy, which transcends feeble physical reality. But we are stuck in physical form and with imperfect minds. So we need these support mechanisms to push us toward philosophy. Eyesight leads to astronomy which leads to mathematics and thus philosophy, and then we’re in business.

It would have been better if we could have skipped those preliminary steps and gone straight to philosophy. Then eyesight would have been redundant. Eyesight isn’t actually needed for true philosophy. We only need it because of our imperfections. We need this little push to get us started on philosophy, but once we’re up and running with philosophy we can pretty much poke our eyes out because they’re not needed anymore.

In this passage, Plato was talking about astronomy but he could just as well have said the same thing for geometry. This is how we must think about the role of geometrical diagrams and sensory perceptions in Plato’s philosophy of mathematics. True mathematics is independent of all that physical stuff, according to Plato. Geometry is not based on physical and sensory experiences with moving figures, drawing lines, and so on, as Aristotle claimed. Diagrams and reliance on the senses are only a stepping stone to true geometry. We need this crutch because our minds and bodies are feeble and imperfect. But once we’ve reached the philosophical level of doing geometry, we can kick away this ladder because then it serves no purpose anymore.

Here’s another colorful image Plato has for this. He’s explaining why birds exist. “[Birds] descended from simpleminded men––men who studied the heavenly bodies but in their naiveté believed that the most reliable proofs concerning them could be based upon visual observation.” And conversely, “land animals came from men who had no tincture of philosophy and who made no study of the heavens whatsoever. As a consequence they carried their forelimbs and their heads dragging toward the ground.”

So the philosophising human is the perfect balance between these poles: not focused on worldly gratification like the beasts, but also not making the mistake of trying to understand things by looking. The birds thought that the best way to understand the stars was to get as close as possible to get a good look. But humans know better. We understand that the best way to understand the stars is by thinking, by philosophising, not looking.

Once again the same can be said for geometry. Too much looking and not enough thinking: that is the cardinal sin that we must avoid not only in astronomy but in geometry as well.

This also fits well with another work by Plato, the Meno. In this work, Plato shows how an ordinary uneducated slave boy can be led to recognize geometric truths, such as a special case of the Pythagorean Theorem. Socrates draws a simple diagram and asks some simple questions, and step by step the boy fills in the reasoning and arrives at the theorem.

Plato interprets this as a sort of awakening. Learning is a form of recollection, he claims. That is to say, the boy did not reach this geometric insight through instruction, or through empirical investigation dependent on the senses. Rather, the boy realized that he knew something that he didn’t know that he knew, so to speak. His inner philosopher was awakened. External input was the trigger for this awakening, but the knowledge had really been there all along. The senses are just a trigger for reawakening this knowledge, not an actual basis for that knowledge. This story sums up the role of the senses in geometry, according to Plato.

So what does this mean for the axioms of Euclid? What kinds of things do the axioms of geometry need to be to conform with Plato’s vision of geometry as this kind of pure philosophy, a work purely of the mind? I think it comes down to a kind of innateness theory of axioms. The axioms of geometry need to be essentially pre-programmed into our minds.

This fits with the idea that learning is recollection, and that mathematics is merely making the mind conscious of things it didn’t know that it knew. There is no external source of this knowledge, according to Plato. The mind just knows it, within itself.

So axioms should be intuitive, instinctive. You should read them and you should go: of course! They should feel like the most natural and undoubtable thing in the world. That’s what Plato’s theory suggests.

Proclus of course agrees. He’s Plato’s mouthpiece, and here’s what he says about axioms: “axioms take for granted things that are immediately evident to our knowledge and easily grasped by our untaught understanding”; “[axioms] must always be superior to their consequences in being simpler, indemonstrable, and evident in themselves.”

That’s almost exactly what Aristotle said. So Plato and Aristotle arrive at the same view of axioms despite their very different outlooks. They disagree on the ultimate origin and foundation of this knowledge: whether it comes from sensory experience and the external world, or whether it comes purely from within our philosophical faculties.

This opposition is famously captured in the iconic fresco The School of Athens painted by Raphael. Plato is pointing to the sky, Aristotle is pointing straight ahead. They are basically pointing to where they think knowledge comes from. Aristotle thinks the source of knowledge is the world before our noses. Plato thinks knowledge resides in a higher realm, above the physical.

But despite this orthogonal disagreement, Plato and Aristotle agree on the properties that axioms must have. Axioms need to be the simplest and most obvious first truths.

Do you agree with them? No, you don’t. You don’t think axioms need to be obvious and intuitive. Either that, or else you think Newtonian physics is a hoax.

Newtonian physics is an example of an axiomatic theory where the axioms are completely non-intuitive. In fact, they are very strongly counter-intuitive. The basic axiom of Newtonian physics is the law of universal gravitation. Any rock is pulling on any other rock, even if they are separated by thousands of miles of empty space. That’s just sheer witchcraft. In fact, you yourself are in a direct bond with all the universe through this mysterious force. It’s like something straight out of science fiction or new age spirituality. Every last one of the thousand stars in the night sky is actively and directly exerting a force on you at any given moment. That’s crazier than any occult astrology you’ve ever heard. Yet that’s Newtonian physics, the most successful scientific theory of all time.

In fact, this example of Newtonian physics corresponds precisely to a kind of blind spot that we should have seen coming in our discussion of axiomatic philosophy. On the one hand we said axioms should be obvious, simple truths, but on the other hand we said axioms are what you are left with after you start with theorems like the Pythagorean Theorem and reduce and reduce and reduce.

Those are two different ideals. And they are not necessarily compatible. The idea of reducing complex theorems into smaller parts does not entail that the axioms you end up with are obvious truths. Axioms are just whatever results when you reduce many theorems to a few core principles. This process could be seen as agnostic as to the nature of the axioms. We just follow the reductive process where it takes us. Just like a chemist cannot decide in advance what kinds of elements he wants the periodic table to contain, so also the mathematician reducing geometry to its building blocks has to keep an open mind and follow the reductive process where it takes him.

At least that’s how Newton interpreted the geometrical method. He’s very clear about this. He’s very explicit about this reductive process being the same in physics as in geometry. Geometry starts with things like the Pythagorean Theorem; physics starts with things like the speeds of the planets and so on. These are the “phenomena” as Newton calls them. And from the phenomena you reason backwards to the underlying causes or unifying principles. That’s what you do in geometry when you show how many theorems can be reduced to a few key principles, and that’s what you do in physics when you show that lots of astronomical data can be derived from a few laws.

Newton is adamant that these two things are the same. “As in mathematics, so in natural philosophy,” he says. “Natural philosophy” means physics. The two are the same, in terms of methodology. That’s how Newton justifies his radical physics. By saying that it’s nothing but what the geometers had been doing all along.

To make this shoe fit, Newton has to sacrifice the idea that axioms are obvious truths, as Aristotle and Plato had claimed. But his interpretation is not crazy. You could read Euclid that way. You could say: Euclid doesn’t care whether the axioms are obvious or not. He just follows the reductive process where it leads. He’s agnostic or open-minded about what kinds of axioms will be the outcome of this process.

Of course that clashes with what Plato and Aristotle said, but they are philosophers so it doesn’t really matter. The important thing is what the mathematicians thought, and their texts are ambiguous enough to allow for the possibility of Newton’s interpretation.

So Newton interprets Euclid a certain way in order to justify his own methodology. Newton’s interpretation is hardly very likely, but it’s also not provably wrong exactly. He’s a clever guy, Newton. He knows his physics is crazy and occult, so he massages an interpretation of the Euclidean tradition to legitimate it.

I don’t think Newton was right in the way he interpreted Euclid. But his perspective is very illuminating nonetheless. For one thing it’s striking that Euclid’s geometry was still so authoritative 2000 years after it was written that cutting-edge modern science was justified on the grounds that its method was the same as that of Euclid. There was no more solid pillar of respectability to anchor your theory to than Euclid. Even then, 2000 years after the Elements was written. Euclid’s city, Alexandria, had burned any number of times, and seen several new religions come and go. But the impact of the geometrical method was above such transient circumstances.

But even just for understanding Greek philosophy of geometry in itself the Newtonian example is useful. Greek philosophers seem to have been blissfully unaware of the possibility of such a theory, where the reductive process leads to non-obvious axioms.

In fact, in Aristotle’s Posterior Analytics there is a phrase that pretty much sums this up. Here’s what Aristotle says: “I call the same things principles and primitives.” Principles are the logical starting points of a deductive system, and primitives are the immediately given truths grounded in perception. Aristotle thinks you might as well regard these as synonyms, apparently. He does not seriously consider the possibility of a viable scientific theory in which these two concepts would not align.

But Newtonian physics is such a theory. It has principles that are not primitives. That is to say, it has axioms obtained by reducing the phenomena down to their smallest parts, but those axioms are not obvious and not intuitive and not known by direct experience.

So the Greeks could have their cake and eat it too. They could have the idea of “reasoning backwards”–of reducing geometry to a few core principles–and at the same time maintain that these core principles should conform to various predetermined philosophical requirements as well, such as being obvious.

Newtonian physics shows that you can’t always have it both ways. At a certain point, you have to pick sides. So you have to decide which of the two you’d rather sacrifice.

Newton picked the brave side, I think. The path less travelled. He sacrificed the idea that axioms should be intuitive. A huge sacrifice, almost unthinkable. It’s like a military general sacrificing 90% of his troops in an audacious manoeuvre. Few people would have dared to even contemplate such a move. But it worked. Even though it was a huge sacrifice, it got Newton into such a strong position that he won the war anyway.

Many people at the time thought Newton was crazy for making this sacrifice. He got a lot of pushback for this. Reduction to non-obvious axioms?! It’s such a radical idea. It goes against everything Plato and Aristotle said.

But in a way Newton’s idea is already contained in Euclid. It’s the idea of reading Euclid backwards. Newton’s perspective may not have been Euclid’s exactly, but it’s useful to keep the example of Newtonian physics in mind to highlight what’s at stake in this tension between the “backwards” and “forwards” directions of reading Euclid.