Practical Foundations of Mathematics

Paul Taylor

1.8  Classical and Intuitionistic Logic

Mathematical reasoning as commonly practised makes use of two other logical principles, excluded middle and the axiom of choice, which we shall not use in our main development. Classical predicate calculus consists of the rules we have given together with excluded middle.

Excluded middle   ``Every proposition is either true or false.''

DEFINITION 1.8.1 Excluded middle may be expressed as either of

omitted prooftree environment        omitted prooftree environment

which are equivalent in the sense that

æ
è
"a.\lnot \lnot aÞ a ö
ø
\dashv \vdash æ
è
"b.bÚ\lnot b ö
ø
.
However, an individual formula may be \lnot \lnot -closed (satisfy the rule on the left) without being decidable or complemented (as on the right).

In saying that we shall not use excluded middle, beware that we are not affirming its negation, \lnot (fÚ\lnot f) º \lnot fÙ\lnot \lnot f, which is falsity. If we are able to prove neither f nor \lnot f then we remain silent about them.

Of course there are instances of case analysis even in intuitionism, in particular the properties of finite sets. A recurrent example will be parsing of terms in free algebras, for example a list either is empty or has a head (first element) and tail (Section 2.7).

Idiomatically, (\lnot Á ) and (ÚE ) are incorporated to give indirect rules

.3 omitted proofbox environment        10pt.4omitted proofbox environment
known as reductio ad absurdum and tertium non datur (Latin: the third is not given) respectively. Some Real Mathematicians use the former habitually, starting every proof with ``suppose not'' (``are you calling me a liar?'') without even attempting a positive approach. Many treatments of logic itself (rather than difficult applications) regrettably do not prove things, but refute their negations instead.

Certain subjects, notably linear algebra, really have no idea of negation:

EXAMPLE 1.8.2 Let M be an invertible matrix. To show that Mx = u has only one solution, it is quite unnecessary to assume that a ¹ b are different solutions, as the proof naturally leads to a = b without the aid of the hypothesis. It is futile to obtain a contradiction to the hypothesis, ie \lnot \lnot(a = b), and then deduce a = b after all.

See Remarks 2.5.7 and 3.7.12 for how this arises in induction.

THEOREM 1.8.3 Negation is a duality in classical logic, interchanging T with ^, Ù with Ú and " with $. That is,

-1pt omitted array environment
The equations on the top row are valid without excluded middle. Those in the left column are called de Morgan's laws, although they were well known to William of Ockham. Excluded middle also makes any implication aÞ b equivalent to its contrapositive, \lnot bÞ \lnot a. []

REMARK 1.8.4 In classical propositional calculus with n propositional variables (and no quantifiers), we may enumerate the 2n cases where each of them is true or false. Such a listing of the values of a formula is called a truth table. Each line of the table which has the value T may be read as a conjunction of possibly negated atomic propositions, and the whole table as the disjunction of these lines; this is the disjunctive normal form and is classically equivalent to the given formula. So if two formulae have the same truth table they are classically inter-provable. In other words, this calculus is complete: anything which is true in all models ( ie all 2n cases) is provable.

omitted array environment 9 = \vtop \special pn 8 \special ar 400 400 400 400 0 6.28319 \wd 9 f\wd 9 \copy 9 \wd 9 fÙy\wd 9 \copy 9 \wd 9 y
Venn diagrams, in which overlapping circles represent propositional variables, are a popular way of illustrating classical propositional logic, although they are misleading when one of the regions turns out to be empty. They were invented in 1764 by Johann Lambert, better known for his work on light intensity; he also proved the irrationality of p and introduced the hyperbolic functions sinh, cosh, etc .

The truth table approach to logic is pedagogically not so simple as is claimed, as the material implication (^Þ f) = T is an obstacle right at the start, which has confused every generation of students since ancient times. It took deeper insight into logic to discover the analogy between (Þ Á ) and defining functions or sub-routines (Section 2.4), but once pointed out it is completely natural, and the (Þ Á )-rule is easily grasped. This analogy is a pearl of modern logic, and I believe students should be allowed to glimpse it, rather than have the prejudices of classical logic reinforced. The material implication does feature in proof theory as the (^E )-rule, but we have already called this the joker of logic (Remark 1.4.4).

The Sheffer stroke   Whereas in art and in other parts of mathematics symmetry is considered beautiful, many logic texts use the de Morgan duality to eradicate half of the calculus in a quite arbitrary way. Instead of presenting " and Ù with their own natural properties, they are treated as mere abbreviations for \lnot $\lnot and \lnot (\lnot aÚ\lnot b), or vice versa . In our intuition Ù and Ú are twins, as are " and $, so why should one of them be treated as a second class citizen in the logical world?

Gottfried Ploucquet (1764) discovered that a single binary connective actually suffices; the operation a\lnand b º \lnot (aÙb) is commonly known as the Sheffer stroke. Using "x.\lnot(f[x]Ùy[x]), Moses Schönfinkel was able to dispose of variables and quantifiers, reducing the whole logical calculus to just one symbol (plus brackets). His paper survives as an important part of the literature because his combinators have types which correspond to the structural rules (Example 2.4.2 and Exercise  2.26ff).

These operations ``simplify'' logic in the same way that it would simplify chemistry if (after the discovery of oxygen, carbon, hydrogen, etc ) sea water had been chosen as the one primitive substance (``element''), on the grounds that the usual 93 can all be obtained from it.

Although this nihilist tendency contributed to the failure of mainstream logic to observe the propositions as types analogy (which is very clear in intuitionism), or to recognise the quantifiers as adjoints, the Sheffer stroke is the building block of digital electronics, where it is called nand.

REMARK 1.8.5 When a small current is passed between the base and the emitter of a transistor, a larger current flows from the collector to the emitter. This effect is used in analogue circuits for amplification, but it performs an essentially negating digital operation: a high voltage on the base (relative to the emitter) causes a low voltage on the collector and vice versa .

omitted diagram environment        omitted diagram environment

The example shows nor, \lnot (fÚy), which may be used in a similar way to nand, so positive operators are made by concatenating such circuits.

Intuitionism   Excluded middle seems obvious, until you think about the reason for believing it, and see that this begs the question. It requires mathematicians to be omniscient, a power no other scientist would claim. Those who feel obliged to justify this insist on classifying structures up to isomorphism before using them to solve real problems.

William of Ockham ( c. 1320), in whose name Henry Sheffer wielded his stroke, considered propositions about the future with an ``indeterminate'' value which even God does not yet know to be either true or false, in connection with the problem of the Free Will of potential sinners. Even Aristotle had his doubts about excluded middle, which is more properly attributed to the Stoics, according to Jan Lukasiewicz.

REMARK 1.8.6 The modern critique of classical omniscience in analysis was formulated by Jan Brouwer (1907). Given a sequence whose ultimate behaviour is an unsolved problem (for which he took questions about patterns of digits in the decimal expansion of p), he constructed counterexamples to several of the major assumptions and theorems of analysis. Suppose for example that all known terms of (\argan) are zero. Although the real number u = ån \argan2-n is well defined, it is not known whether u = 0 or u > 0, so we cannot find even the first digit of 1-u in the usual decimal expansion. He showed that continuous functions need not be Lebesgue-integrable, or have local maxima on [0,1]. On the other hand, all intuitionistically definable functions R® R are continuous.

With hindsight, it is unfortunate that Brouwer chose analysis to attack. As we shall demonstrate throughout this book, logic and algebra can be presented intuitionistically, with no noticeable inconvenience, whereas most of the usual properties of the real numbers rely on excluded middle. For this reason we will not attempt to cover constructive analysis, but Errett Bishop [BB85] gave an account which is very much in our spirit: it makes such alterations to the definitions as are required, and gets on with proving the traditional theorems as well as can be done, discarding naive formulations rather than dwelling on their counterexamples.

Every reform provokes reaction. Although Hilbert's influence is to be found behind almost every pre-war revolution in logic, he also made such comments as ``no one shall drive us from the paradise that Cantor created for us'' and his Platonist battle-cry, ``Wir müssen wissen, wir werden wissen'' (we must know, we shall know). Later he claimed that ``to prohibit existence statements and the principle of excluded middle is tantamount to relinquishing the science of mathematics altogether.''

Rather more of mathematics than Hilbert's ``wretched remnants'' has now been developed intuitionistically. Often, however, we shall find it convenient to assume excluded middle in order to give some simple introductory examples of concepts, particularly when the most familiar form occurs in the context of R. There are also habits of language (``N has no proper subalgebra'') which one is reluctant to give up: they are to be understood as idioms. The reader for whom constructivity is essential will be able to recognise the cases where these conventions apply.

Andrei Kolmogorov (1925) devised a translation of classical proofs into intuitionistic ones, which is often attributed to Kurt Gödel.

FACT 1.8.7 Define f\lnot \lnot by structural recursion on predicates as follows:

omitted eqnarray* environment

If G\vdash f is provable classically, then G\lnot \lnot\vdash f\lnot \lnot has an intuitionistic proof, ie without using excluded middle. In particular, intuitionistic logic does not save us from any inconsistency (the ability to prove ^) which might arise classically. []

The axiom of choice   The increasingly abstract form of late nineteenth century mathematics led to the use of infinite families of choices, often with no conscious understanding that a new logical principle was involved. Giuseppe Peano did formulate, and reject, such an axiom in 1890, and Charles Sanders Peirce gave the following definition in 1893, but it was Ernst Zermelo who first recognised how widely it had already been used. His 1904 proof of the well-ordering principle (Proposition 6.7.13 and Exercise  6.53) attracted vehement opposition - at least, judged by the contemporary standards of courtesy, so much higher than today. It was in order to formulate his response that he found his famous axioms of set theory, which we shall discuss in Remark 2.2.9.

DEFINITION 1.8.8 The axiom of choice says that

any entire relation R:X\leftharpoondown \rightharpoonup Y (Definition 1.3.1(c)) contains a total functional relation f:X® Y, ie "x.xR® f(x).

Exercises 1.38 and 2.15 were how Burali-Forti and Zermelo formulated it; the former is more convenient for category theory, and Radu Diaconescu showed that it implies excluded middle (1975, Exercise 2.16). Russell and Whitehead used a ``multiplicative axiom,'' but this is only meaningful in the context of a much stronger principle (Proposition 9.6.13). Well-ordering was later supplanted in algebra by maximality properties such as Zorn's Lemma (Exercise 3.16), actually due to Kazimierz Kuratowski.

In the first use of Zermelo's axioms, Georg Hamel showed that R has a basis as a vector space over Q. Tychonov's theorem (that a product of compact spaces is compact) and many other famous results have also been shown to be equivalent to it.

Not all of the consequences of Choice are benign, for instance it allows us to define non-measurable sets, with the bizarre corollary (due to Felix Hausdorff) that a sphere can be decomposed into two or more spheres, each congruent to the first. The moral of this is that if we allow the Angels to employ brute force, then the Devil will make use of it too. Even Zermelo and many of the enthusiasts for Choice considered it appropriate to indicate when results depend on it.

When Hilbert gave his basis theorem using Choice, Paul Gordon (Emmy Noether's thesis adviser) said of it, ``Das ist nicht Mathematik, das ist Theologie,'' having worked on the subject for twenty years using what we would now call constructive mathematics. Although we, on the cusp of the millennium, now reject Choice, it was the way forward at the start of the twentieth century: it stimulated research throughout mathematics, notably in the Polish school, which we have to thank for numerous ideas in logic and general topology mentioned in this book [Moo82, McC67].

Zermelo conjectured that Choice was independent of his other axioms, and Abraham Fraenkel devised models with permutable ur-elements in order to prove this. Kurt Gödel (1938) showed how to cut down a model of Zermelo's axioms to the ``constructible sets,'' for which Choice is provable. However, it was 1963 when Paul Cohen found a model of Zermelo's other axioms in which Choice fails.

The axiom of choice is typically not needed in the concrete cases, because their own structure provides some way of making the selections (we shall indicate real uses of Choice by the capital letter). Often it is used to extend a property of some familiar structures to the generality of an abstract axiomatisation, but even then the need for Choice may be more a feature of that particular formulation than of the actual mathematical structures. For example, Peter Johnstone [Joh82] showed that, by a conceptual change from points to open sets, Tychonov's Theorem could be proved without Choice. It is for infinitary algebra in Sections 5.6 and 6.2 that this axiom will be most missed in this book. We respond to this difficulty by examining what infinitary operations are of interest in practice.

In the countable case the following assumption, formulated by Paul Bernays, is often more directly applicable.

DEFINITION 1.8.9 The axiom of dependent choice says that

any entire relation XR\leftharpoondown \rightharpoonup X with an element x0 Î X contains an (w-)sequence, ie a function x(-):N® X, such that "n.xnR® xn+1.

If R is a function then this is primitive recursion over N (Remark  2.7.7). Similarly, we get Dependent Choice by repeatedly applying the choice function to the seed. König's Lemma (Corollary 2.5.10) is a widely used form of Dependent Choice throughout informatics and combinatorics. Dependent Choice does not imply excluded middle or vice versa .

Logic in a topos   For Jan Brouwer and his student Arend Heyting, Intuitionism was a profound philosophy of mathematics [Hey56, Dum77,Man98], but like increasingly many logicians, we shall use the word intuitionistic simply to mean that we do not use excluded middle. This abstinence is nowadays very important in category theory, not because of any philosophical conviction on the part of categorists (indeed most of them still use excluded middle as readily as the Real Mathematician does), but because it is the internal logic of the kind of world (a topos) which most naturally axiomatises the familiar mathematical universe.

Joachim Lambek and Philip Scott [LS86] show that the so-called ``term model'' of the language of mathematics, also known as the ``free topos,'' may be viewed as a preferred world, but Gödel's incompleteness theorem shows that the term model of classical mathematics won't do.

Category theory provides the technology for creating new worlds. This is quite simple so long as we do not require them to be classical. Why should we want such worlds, though? One application is to provide the generic objects ( cf those used in proof boxes) in such a way that we may reason with them in the ordinary way in their own worlds, and then instantiate them. Arguments about ``generic points'' have been used in geometry since Giuseppe Veronese (1891), but the logic is unsound if, for example, the equality of two such points is required to be decidable.

Worlds have also been created with convenient but exotic properties. In synthetic differential geometry [Koc81] all functions on R are continuous, as Jan Brouwer said, and it is legitimate to use infinitesimals in the differential calculus. More recently, synthetic domain theory has similarly postulated that all functions are to be computable.

Excluded middle was traditionally regarded as a true fact about the real world, so in order to investigate intuitionistic logic it was necessary to build fictional worlds where excluded middle does not hold. The point of view of this book is that these worlds are not exotic but quite normal, and their logic is perfectly typical, just as algebraic extensions of the rationals have come to be seen as ordinary number domains with straightforward arithmetic structure.

It is not necessary to know in advance how to construct such worlds from classical ones before learning how to reason in them. Indeed excluded middle is like the fear of water: it's easier to learn to swim as a small child, before anyone's told you that it's difficult.

Whatever your philosophical standpoint may be, intuitionism forces you to write mathematics much more cleanly, and to understand much more deeply how it works. Proof by refutation runs backwards, and so the argument gets tangled. The constructive character of intuitionism is really due to the strong analogy with type theory, which does not extend to excluded middle. We describe it in Section 2.4, making use of another idea due to Kolmogorov.