Method and critique

3 Method and critique

We observed in §1.6 that category theory is adept at capturing the significant ideas in a mathematical discipline in the form of universal properties, whilst axioms may be expressed in symbolic logic as systems of introduction and elimination rules. Then in §2 we demonstrated the formal connection between these.

3.1. On this basis, we propose a methodology for devising a new foundational system for a subject. This is summed up by the diagram

In this, the task of the journeyman mathematician — the deduction of theorems from axioms — is to fill in the upward arrow on the right. Hopefully, this will make the rectangle “commute” in the sense that we recover the old theorems from the new axioms.

However, this job only makes sense in a professional framework in which axioms have already been chosen to capture the intuitions and applications of the discipline. This choice is represented by the downward arrow on the left.

The constructions that we sketched in §§2.8f provide the horizontal arrow in the middle, whilst the computational interpretation (bottom right) depends on the details of the resulting calculus (§2.3).

We shall show how this programme works for the example of general topology in the remaining sections of this paper , with some hints about how to apply the ideas to other subjects.

3.2. The first arrow in some sense derives the axioms from the theorems.

Euclid and Bourbaki gave us a style of presenting mathematics that begins from axioms which (so far as their text is concerned) come out of the blue, and deduces theorems from them as if the results were obvious and inevitable. Abel described Gauss as “like a fox, who effaces his tracks in the sand with his tail”. Consequently, lesser mathematicians think that it is enough to state a bunch of axioms in order to justify the relevance of their results, whilst students learn nothing about the ways in which their masters discovered their theorems.

In following our methodology, we cannot “write down all the axioms first”, because finding the right axioms is the goal of the investigation. Being able to deduce the old theorems from the new axioms is an experimental test of the axioms that we propose. So before we have discovered the final version of the axiom system, we need to conduct preliminary experiments in ad hoc frameworks (cf. §5).

Along the way, like any engineer, we shall assemble several prototypes. This means that a certain amount of repetition of tests in needed. Of course this happens all the time in mathematics too [Lak63] — it is just the textbooks that falsify history by saying otherwise. At the end of the day, we want to state our chosen axioms and deduce the important theorems from them. But that will indeed be the end of the process, which will then no longer be an active piece of foundational research.

For a modern study of a wide range of techniques for discovering the principles of several mathematical disciplines, see [Cor03].

3.3. Counterexamples play a major part in the empirical development of mathematics. Unfortunately, they are sometimes given such a degree of prominence that they can stifle subsequent progress.

Typically, the argument behind a counterexample may prove

D, E, F ⊢ ¬ G,

entirely rigorously. Then the reader is expected to agree that,

therefore, E is false,

but why? D, F and G had exactly the same status in the proof, so why pick on E?

This is an argument by authority, not a valid piece of mathematics: the author has simply used sleight-of-hand to make us accept his prejudices. These become embedded in the literature and treated as if they were theorems, on which whole theories are built. Having forgotten that the counterexample depended on cultural assumptions, people subsequently resist the introduction of a new paradigm.

This abuse of counterexamples is very common, but the one that has caused the greatest intellectual paralysis is surely Georg Cantor’s, that no set is in bijection with its powerset. On this he built his theory of cardinality, which is far too coarse to be of any use in mainstream mathematics. He said of the discovery that ℝ²≅ℝ, “I see it but I don’t believe it”, but this did not deter him from jumping into the abyss.

Because of Cantor’s theorem, a set X can only satisfy X≅ X^X in the trivial cases X≡∅ or 1. Apparently, the untyped λ-calculus is therefore inconsistent. Dana Scott eventually saw through this, and constructed topological lattices with this property [Sco93]. As these can be embedded in presheaf toposes, where we may treat them as intuitionistic sets, we see that Cantor’s supposedly fundamental theorem of set theory actually relies on excluded middle.

The message that we ought to take from a counterexample is no more than that you can’t do it like that. Maybe someone someday will think of a different way. They will identify underlying assumptions A, B and C that had previously been overlooked, and achieve all of D, E, F and G on some other basis, using A′, B′ and C′ instead.

In particular, just as in the use of formal methods for security (putting a strong lock on a weak door), it is the very rigour of a fragment of argument that can be most misleading about the bigger picture. In our architectural metaphor, counterexamples are reasons for choosing one general scheme over another, whilst theorems hold the building up once we have chosen which plan to carry out. A tower block may later turn out to be a mistake for sociological reasons that have nothing to do with how accurately its plans were drawn.

It would, perhaps, be useful to make a clear distinction between the principal track of a theory (from axioms to theorems) and its peri-theory. This is the discussion around the theory, consisting of the main examples, prototypes and counterexamples that led us to the choice of statements of the axioms and theorems. Other examples of peritheory are converses that lead back from the main theorem to (the necessity of) the chosen axioms, and more generally the lists of equivalent alternative formulations of the axioms that some textbooks like to give.

The whole of this paper (apart from §§6 & 8) concerns the peritheory of topology,

3.4. Unfortunately, it is not as straightforward as we have suggested to work from a client’s brief: this cannot be taken literally, and has to be negotiated.

If the client is allowed to dictate what is fundamental, the new foundations will be no more flexible than the old ones. It takes an outsider’s perspective to distinguish the key elements of an enterprise from the accumulation of detail. The first arrow does not therefore just copy the theorems and chapter headings of the old theory.

In approaching a mathematical discipline, category theory often focuses on ideas that its specialists previously regarded as trivial. For example, the subobject classifier and Sierpiński space (§7) are merely two-element sets in classical set theory and topology. Also, whilst mathematicians have acknowledged that the theorems that create important structures are adjoints, an adjunction is a two-way relationship. In this, the other partner typically does something that appears to be mere bureaucracy, namely to “forget” the same structure.

3.5. Another issue on which the clients’ preconceptions need to be challenged is the identification of their most valuable products. It is up to their customers to do this. For example, set theorists put much of their effort into considering infinite cardinals, but the demand from the market is for more prosaic things, such as quotients of equivalence relations. These are axiomatised explicitly in categorical logic(§5).

On the other hand, maybe led by an over-emphasised counterexample, specialists sometimes develop variant forms of a subject that omit properties which the customers may regard as essential. One case of this is compactness of the interval [0,1]⊂ℝ (§1.8), which is absent from Bishop’s theory [BB85].

3.6. Negotiation of this kind is how we answer the final objection to changing foundations, namely that the prevailing set-up is crucial for the whole of science (§4).

Science and mathematics are systems whose component disciplines interact across an interface. The higher-level components depend only weakly on the details of those below, but do have requirements for certain features. This is a commonplace in modern technology. It is also well known in the wider picture of science: whilst chemistry depends on physics, it only uses four sub-atomic particles, organic chemistry for the most part only four elements, and genetics only four bases and twenty amino acids. Physics very probably relies on compactness of the interval, for example to ensure that local patches of solutions to a differential equation fit together into a global solution, instead of forming a singular cover (§1.8). On the other hand, I would be very sceptical if you told me that some property of black holes depends on excluded middle. Have you actually developed the analogous constructive theory, and found observational evidence to distinguish it from the classical one? This is, after all, what the experimental method says you should do.

However, the question of whether some application in science actually depends on a particular foundational principle is never an easy one. Settling the necessity of each invocation of excluded middle, Heine–Borel or a large cardinal in the chain of theorems about functional analysis or other subjects may require a decade’s research. Unfortunately, the commonest way of handling this is “megaphone philosophy” that, if it contains any mathematics at all, consists largely of the abuse of counterexamples.

3.7. When we replace a component of a system, it needs to be backwards-compatible in its function, not necessarily its implementation or extent. After all, an architect who has been commissioned to replace a building will only do so with an exact copy if it is to be a museum.

In the case of general topology, the methods of construction in this paper are such that the new building must either be smaller than the old one (consisting of just locally compact spaces) or substantially bigger. This is entirely consistent with the historical development of the subject, which grew from figures that are embedded in Euclidean 3-dimensional space, to ℝⁿ, to projective and non-Euclidean geometry, to manifolds, to spaces of functions in analysis, and to domains for denotational semantics of programming languages. The intuition of continuity has been captured in numerous quite different ways, making use of metrics, uniformity and convergence of sequences, as well as systems of points and open neighbourhoods. Each axiomatisation leads to a different totality of domains of continuity.

For comparison, there can be little doubt that groups and fields have the right axioms for the intuitions that they seek to capture. This is indicated by the fact that we use the axioms directly for computation. Algebra textbooks are also able to classify finite fields, and they make a serious attack on the similar (albeit intractible) problem for groups.

The situation in general topology is much less clear. It is rather like a medieval “world” chart that more or less accurately depicts the Mediterranean, but has mythological creatures around the outside. The various approaches to continuity accurately capture real manifolds, just as the old cartographers recorded their own familiar territory, but we cannot be confident in using the outer parts of the chart.

The error in both cases is the co-ordinate system. Its assumptions provide reasonable approximations locally, but by their nature entail certain boundaries to the global system. Flat charts are useful, but in ancient times Eratosthenes had known, not only that the Earth is spherical (from lunar eclipses), but even how big it is. Similarly, it has been known since the 1960s that points and open sets are the wrong co-ordinate system for topology. Sheaves in algebraic geometry were based on open sets and not points , whilst algebraic topologists sought more “convenient” categories [Bro64, Ste67], i.e. those that admit general spaces of functions. However, it is by no means clear what topologies these function-spaces should carry, especially if we want to invesigate the properties of ℕ^{ℕ^ℕ} and ℝ^{ℝ^ℝ}(cf. §12.10).

There is, therefore, nothing special about the boundaries of the category of objects that the textbooks call “topological spaces”. These books treat non-Hausdorff spaces with derision, and make little attempt to explore the full extent of even the world that is measured out by their own co-ordinate system. This was only begun when the analogy with the ∃∧-fragment of logic was recognised, cf. [Vic88]. We therefore undefine the terms “space” and “topological space”, leaving them open to new definitions. The textbook spaces will be re-branded as Bourbaki spaces [Bou66] to strip them of their authority.

3.8. Even when we have found the right co-ordinate system, it may not be appropriate to describe it in the same way for both the foundations and the applications. Any engineer knows that the user manual for a gadget should not be written in the same way as its technical specification. So this paper discusses the foundations of ASD, whilst [I][J] provide separate introductions that are suitable for its applications to elementary real analysis.

One reason for this is that the technical ideas may later be redeployed to make something else with an entirely different use. In our case, we shall find that there is a new underlying abstract structure (§§6,12) that has other applications besides topology, and deserves to be studied in its own right. It includes certain definitions that, in the presence of the specifically topological structure, have different characterisations (cf. §7.7). This is in line with the usual experience of applying category theory to mathematical disciplines, namely that ideas with various different traditional formulations turn out to be examples of a common abstract idea.

Secondly, the demands from users are often the driving forces behind advances in technology. However, as we have seen for computing hardware and software, the biggest improvements do not result from adding bells and whistles, but by re-thinking and strengthening the fundamental principles. In ASD, the principal challenge for the future is how to extend the boundaries beyond local compactness, in such a way that Banach spaces are given “the right” topology, whatever that means. In order to do this, it is the initial formulation of the underlying abstract structure in §6, rather than its adaptation to topology, that needs to be replaced.

3.9. Finally, there is an issue for which architecture is entirely the wrong metaphor and the departure of Columbus from the Mediterranean is much more appropriate. Los Reyes Catolicos certainly did not promote free intellectual exploration in their domestic and colonial policy, but they did at least fund a “blue sea” project.

Nowadays, one is asked to give advance notice of all of the theorems that one intends to prove. Such planning is appropriate when building a house, but it is possible if and only if there are no original ideas. A mathematician with a plan for a theorem wants to carry it out straight away, and the only pieces of equipment that are needed are a clear head and a clear blackboard. We don’t put our lives at stake as Columbus did when we embark on scientific experiments or try to prove mathematical theorems, but if there is no intellectual risk of failure in a proposed piece of research, then it is redundant, and probably not worthy of funding.

We like to think that the finished product of mathematics is the most precise of any branch of science or engineering. The corollary of this is that the vision of a mathematical project in advance of its detailed plan is necessarily much more vague than in any other discipline.

And things may not go according to plan even if we do succeed, because there may be a new continent to discover. For a powerful account that has a far wider relevance than to physics, see Part IV of [Smo06], which demonstrates a familiarity with the real experience of those who do revolutionary science that is absent from [Kuh62].

It is time to apply this method to some mathematics.