Practical Foundations of Mathematics

Paul Taylor

6.7 The Ordinals

The ordinals admit a peculiar kind of arithmetic, into which it is often possible to make a crude translation of syntax. This provides a simple but powerful way of finding loop measures and so justifying termination or induction principles.

It was already known before Georg Cantor that a real-valued function could have a Fourier representation even if it had finitely many points of discontinuity . He saw that this could be extended to functions which were discontinuous on a set \typeX₀ such as {0}È{1n|n Î N} with an accumulation point \typeX₁ = {0}. Repeating the argument, \typeX₁ could have a set \typeX₂ of accumulation points and so on. Then in 1870 he realised that the set Ç\typeX_n could also be non-empty, and that a ``transfinite'' sequence of sets \typeX_¥ , \typeX_¥+1, ... could be defined. We might picture ¥, now called w, as a diminishing sequence of matchsticks:

1 = \ordinal 1 2 = \ordinal 2 3 = \ordinal 3 w = \ordinal w w+2 = \ordinal w+2

w2 = \ordinal w2 w3 = \ordinal w3

w²+w2+3 = \ordinal w²+w2+3

Although Cantor had developed ordinal arithmetic in 1880, only in 1899 did he formulate a correct definition [Dau79], that an ordinal is an ordered set in which every non-empty subset has a least member: ``there's a first time for everything.'' In other words, a trichotomous (Definition 3.1.3) well founded relation (Proposition 2.5.6).

The easiest way to tell these countable infinities apart is to work backwards, ie find out what descending sub-sequences there are. Every such sub-sequence must eventually stop: that's well-foundedness (Proposition 2.5.9). In the longer ones, there is more opportunity to dawdle down, but from time to time we must leap from the heights of a limit ordinal, landing somewhere that is infinitely far below.

The relation on an ordinal is easily seen to be transitive and extensional (Example 6.3.3), so it is useful to identify each ordinal with its set of predecessors, cf 5 = {0,1,2,3,4} and Remark 6.3.5.

REMARK 6.7.1 The operations on ordinals are as follows:

(a): Zero (the empty string of matchsticks) is an ordinal.
(b): If a is an ordinal then so is succa, its successor, also written a⁺. We embed a Ì succa as an initial segment, and use a to name the extra matchstick on the right. Under the identification of each b Î a with {g Î a|g \prec b}, the elements b and a of succ a are subsets of a, so succa may alternatively be defined to consist of the initial segments of a. These definitions are equivalent classically, but not intuitionistically, and we shall be ambiguous for the moment as to which is intended.
(c): If a_i for i Î I are ordinals then so is È_{i Î I} a_i, this union being taken with respect to the inclusions of initial segments.

Successor and directed union preserve Cantor's definition, but comparing two ordinals (or directed unions) given separately relies on excluded middle: if aÇb ¹ a then b is the least element of a\(aÇb).

Between any two ordinals there is a reflexive relation ( Ì ), whether one is an initial segment of the other. On the other hand, the identification between systems and individuals gives rise to an (irreflexive) well founded relation, written as Î . They are related by

omitted eqnarray* environment

Classically, any ordinal is either zero, or a successor or a limit (union). These cases may be distinguished by trying to define the predecessor,

pred(a) =

a = {g| $b.g Î b Î a} Ì a.

Then pred(succ(a)) = a, and either pred(a) Î a if a is a successor ordinal, or pred(a) = a if it is a limit.

LEMMA 6.7.2 For any ordinal, a = È{succb|b Î a}. []

Transfinite recursion Since ordinals are a special case of well founded relations, for any \ev_Q:P(Q)® Q the equation

p(a) = \ev_Q ({p(b)|b Î a})

has a unique solution by the General Recursion Theorem 6.3.13.

REMARK 6.7.3 The classical transfinite recursion theorem is based on a three-way case analysis. Instead of \ev_Q , it takes an element x Î Q, and functions \ops_a :Q® Q and \opu_l :Q^l ® Q. (Since the arity of u depends on the ordinal anyway, we are obliged to treat the case where the argument is used in the evaluation phase of the recursion.) Then there is a unique function p with

omitted eqnarray* environment

PROOF: Since u may make use of its arguments in an entirely arbitrary way, the general recursion theorem for free theories must be used, with operations \opr_a :Q^succa\rightharpoonup Q of each arity. For

\opr_(succa)x_b :b Î succ a = \ops_a (x_a ),

the operation \opr_(succa) must be able to select x_a , its ath argument, by its position. Of course \opr₀(Æ) = x and \opr_l = \opu_l . []

This arbitrariness is not possible in the intuitionistic version, but recall that infinitary operations tend to be given by universal properties, join being typical. The three conditions are regarded as equations to be solved or ``boundary conditions'' rather than as a case analysis.

THEOREM 6.7.4 Any system of ordinals obeys the universal property (recursion scheme) for a free partial (Ú,s)-algebra with s monotone. That is, for any complete join-semilattice Q equipped with a monotone unary operation \ops_Q :Q® Q, there is a unique homomorphism p, ie

omitted eqnarray* environment

In particular p(0) = ^ (the empty join), p is monotone in the sense that

b Ì aÞ p(b) £ p(a),

and for any limit ordinal l we still have p(l) = Ú{p(a)|a Î l}.

PROOF: Using Lemma 6.7.2, p must satisfy

p(a) =

{\ops_Q (p(b))|b Î a},

which has a unique solution by Theorem 6.3.13. Then

omitted eqnarray* environment

whence in particular p is monotone. The successor equation,

p(succa) =

{\ops_Q (p(b) )|b Ì a} = \ops_Q (p(a)),

is satisfied since, by monotonicity of p and \ops_Q , \ops_Q (p(a)) is the biggest term in the join. The property for limit ordinals still holds since any such ordinal is closed under successor. []

The seed x need not be ^, as in place of Q we may consider the upper set x¯ Q º {x|x £ x} (Example 3.1.6(f)). The unary operation s may also take an ordinal parameter, in which it must be monotone.

Remark 9.6.16 considers iteration of functors, with filtered colimits in place of directed joins, and using the axiom-scheme of replacement.

Rank It is a striking feature that systems of ordinals share many of the properties of the individuals. (We have already seen this for wff- systems in Remark 6.2.10.) However, Cesare Burali-Forti (1897) showed that the whole system is a proper class: it cannot be an individual. Dimitry Mirimanoff (1917) showed that this and the Russell paradox don't arise in Zermelo set theory (Remark 2.2.9), by measuring the well-foundedness of each definable set. This was later incorporated into Zermelo- Fraenkel set theory as the axiom of foundation.

DEFINITION 6.7.5 For any well founded relation (X,\prec ),

p(t) =

{s(p(u))|u\prec t}

defines the ordinal rank function on X, using the axiom of replacement. Remark 3.7.12 gave the finitary version of

PROPOSITION 6.7.6 Let t Î F be a term for a free theory T. Then the rank of t with respect to the immediate sub-expression relation is the first stage at which t occurs, ie t Î T^p(t)(Æ). []

Arithmetic In practice it is not feasible to calculate the rank of a term or proof exactly, cf the number of iterations needed to complete a loop (Remark 2.5.13ff). Instead of the least upper bound È in the previous definition, we assign to r([(u)\vec]) any value which is larger than those of the sub-expressions [(u)\vec], as in Examples 2.6.3ff. This can be done using ordinal arithmetic. It was to define ordinal exponentiation that Cantor introduced transfinite recursion.

DEFINITION 6.7.7 The recursive definitions for N in Example 2.7.8,

omitted array environment

are extended transfinitely by making each operation preserve inhabited (directed and binary) joins in the second argument b. (Using upper sets, they can instead be considered to preserve all joins.)

LEMMA 6.7.8 By induction on g, one may prove successively that

omitted eqnarray* environment

but

1+w = \ordinal 1 \ordinal w = w 2*w = \ordinal w = w

w = (2+3)*w ¹ 2*w+3*w = w*2 and w³*2 = (w*2)³ ¹ w³*8. []

It is also important to appreciate that a^b is not an exponential in either a cartesian closed category or a Heyting lattice. One way of seeing this is that the latter have left adjoints (product and meet).

REMARK 6.7.9 As these operations preserve unions, they have right adjoints (Theorem 3.6.9), called subtraction, division and logarithm,

omitted diagram environment

where On is the class of ordinals (reflexively) ordered by inclusion. The units and co-units of the adjunctions satisfy

omitted eqnarray* environment

subject to the conditions on the right. Notice that we subtract matchsticks from the left, so (-)-1 is not pred (for example at w+1 ). The second equation, however, depends on excluded middle: 1-f = \lnot f and f+(1-f) = fÚ\lnot f for f Ì 1. Since for 2 Ì a and 1 Ì g we also have

g¸a^{log_a g}\prec a g-(a^b *(g¸a^b ))\prec a^b ,

it is possible (classically) to write any ordinal g uniquely as

a^b_n*d_n + a^b_n-1*d_n-1 +···+ a^b₀*d₀ + a^b₁* d₁,

where b_n\succ b_n-1\succ ···\succ b₂\succ b₁ and d_i\prec a. The b_i may then also be decomposed in the same way. In the case a = w, this hereditary expression is known as the Cantor normal form. []

REMARK 6.7.10 Arithmetic on matchstick pictures may be represented by the lexicographic product (Proposition 2.6.8), and in fact any finite tower w, w^w , w^{w^w} , ... can be encoded as an order on N (or a subset of Q) using primitive recursion. Lemma 6.7.8 may be extended to a system of reduction rules (there are some messy extra cases for a^b ), and such expressions do have a (Cantor) normal form. Moreover, by comparing successive exponents of w and their coefficients , the trichotomy law holds, just as it does in N.

The union (equivalently, the collection) of all ordinals which may be expressed using finite towers of ws is called e₀. By induction over e₀ one can prove consistency of Peano arithmetic (with primitive recursion), so by Gödel's Incompleteness Theorem 9.6.2 general recursion must be needed to show that e₀ is well founded. Proof theory, which is typically concerned with extensions of first order logic by specific induction schemes, uses the representability of ordinals to measure the strength of formal systems. (The final section looks at other ways of doing this.)

Classical applications Refining Burali-Forti's argument, Friedrich Hartogs (1917) actually provided arbitrarily large ordinals.

LEMMA 6.7.11 For any set X, the set HX of ordinal structures on subsets (or subquotients) of X forms an ordinal, and there is no injective function HX\hookrightarrow X. In particular w₁ = HN is the first uncountable ordinal and others are defined by iteration (Example 9.6.15(b)). []

EXAMPLE 6.7.12 Let s:Q® Q be a monotone endofunction of an ipo (Definition 3.4.4). With a = HQ, define p:succa® Q by Theorem 6.7.4. Then x = p(a) satisfies \lnot \lnot (x = sx). If there is a fixed point, y Î Q such that (y = sy), then x £ y. []

Assuming excluded middle, x is the least fixed point, and conversely Exercise 6.54 derives ordinal-indexed joins from least fixed points.

Besides transfinite recursion, ordinals also provide a convenient way of performing constructions which require the axiom of choice.

PROPOSITION 6.7.13 The following are classically equivalent:

(a): the axiom of choice as Ernst Zermelo gave it: for any set Q, there is a function c:P(Q)\{Æ} ® Q with "U.c(U) Î U;
(b): the well-ordering principle: any set carries an ordinal structure;
(c): Zorn's lemma: any ipo (Definition 3.4.4) has a maximal element;
(d): the axiom of choice as given in Definition 1.8.8.

Recall that the axiom of choice implies excluded middle (Exercise 2.16).

PROOF: [cÞ d] and [dÛ a] were Exercises 3.16 and 2.15.

(a): [[aÞ b]] Define p(a) = c(Q \{p(b)|b Î a}) by the General Recursion Theorem 6.3.13. Hartogs' Lemma bounds the ordinal needed.
(b): [[bÞ a]] It is the typical use of the well-ordering principle to let c(U) be the first element of U.
(c): [[aÞ c]] Apply Example 6.7.12 to s(x) = c({y|x £ yÙ x ¹ y}); as there is no fixed point, the construction must cease to be possible at a certain stage, which is a maximal element. []

Axiomatisation It is quite usual for intuitionistic logic to cause a bifurcation of definitions and results. The Cantor Normal Form fails as a theorem, but proof theory has found ordinals in this form (as synthetic expressions) valuable as a measure of complexity. On the other hand, what should the analytic second order definition of ordinal be? One objective of such a theory would be to replace the set theory which still litters infinitary algebra, for example linking the ``cardinal'' rank in Definition 6.1.1 to the ordinal one in Proposition 6.7.6, and to preservation of k-filtered colimits.

Cantor's definition is equivalent, classically, to requiring the relation \prec to be transitive, extensional and well founded. The successor is just aÈ{a} and the condition "x.x £ \ops_Q (x) suffices for the Transfinite Recursion Theorem 6.7.4. However, intuitionistically, we only have

omitted array environment

There are several intuitionistically inequivalent notions of ordinal, (some of) which André Joyal and Ieke Moerdijk [JM95] characterised as the free structures in the sense of Theorem 6.7.4 for which s (a) satisfies no condition, (b) obeys x £ s(x), (c) is monotone or (d) preserves binary joins. These correspond to sets, transitive, plump and directed ordinals.

The problem lies in the confusion of Î with Ì in the classical theory (where

b Ì aÛ b Î aÚb = a). In fact there are three relations: the partial order should be treated a priori separately even from the inclusion defined in terms of the (irreflexive) well founded relation.

REMARK 6.7.14 The results of Section 6.3 may be developed for the functor shv:Pos® Pos (Definition 3.1.7) in the place of the covariant powerset P:Set® Set. A shv- coalgebra is well founded in the sense of Definition 6.3.2 iff \prec is a well founded relation (Definition 2.5.3). The difference between sets and ordinals lies in the notion of ``mono'' used to define extensionality, viz inclusions of lower sets. The structure map

(X, £ ) \hookrightarrow ^{a ® {x|x\prec a}} {U Ì X| "u,x.x £ u Î UÞ x Î U}

of a well founded coalgebra is the inclusion of a lower set iff

"a:X."U:shv(X). U Ì {x |x\prec a} Þ $b:X.U = {x|x\prec b}.

Like Cantor's definition, this property implies that \prec is extensional and transitive in the usual sense, and £ coincides with the inclusion order. However, this definition of plump ordinal is much stronger: U is an initial segment of X, and so is also a plump ordinal. Therefore

U Ì a Î XÞ U Î X,

which, classically, reduces to transitivity as U Ì aÛ U Î aÚU = a.

The successor operation must be adjusted accordingly: now succa does indeed consist of all initial segments of a, and all four implications above become reversible. But, as their name suggests, plump ordinals consist of rather more than the matchstick pictures: 2 is W, 3 is shv(W) and the axiom-scheme of replacement seems to be needed to construct plump w.

Unlike the classical ones, neither the transitive nor the plump ordinals need be directed (intuitionistically): this rather useful property has to be imposed separately (and hereditarily). In fact the same strategy, now in the category of binary semilattices (with Ú but not necessarily ^), provides directed plump ordinals. These are better in that only directed joins are needed in the Transfinite Recursion Theorem 6.7.4.

Written in set-theoretic language, these intuitionistic notions of ordinal are very complicated [Tay96a]. Category theory has come to the rescue, restoring the unity of the theory by abstract analysis of the old notions of extensionality and well-foundedness. The constructions may then be applied to other functors which do not have the ontological pretensions of the powerset, but the resulting ordinals have quite different properties.

For example, we use least fixed points in informatics, and expect to find a system which, unlike Cantor's, naturally stops at w, this being its own successor, rather than being curtailed perfunctorily à la Hartogs [Tay91]. For this, TX in Dcpo is the lattice of Scott-closed subsets of X, where we used lower subsets in Pos above. Domains of this kind have been generalised by Roy Crole and Andrew Pitts [ CP92] to ``FIX objects'' in more complicated recursive structures. The induction scheme must be restricted to a smaller class of predicates, such as the Scott-continuous ones in Theorem 3.7.13.

The proof of Proposition 6.7.13 above was not Zermelo's, which instead constructed the well founded relation on Q using the substance of Q itself, rather than the monolithic structure of all ordinals. This monolith has failed quite spectacularly to provide an intuitionistic proof of the fixed point theorem in Example 6.7.12: such a proof was found in the centenary of Burali-Forti, but depends on no such machinery. We have already given it in Exercise 3.46, and in fact it is very similar to Zermelo's 1908 argument (Exercise 6.53).

The remaining chapters build and study Cn^[]_L, often using induction, recursion and parsing, but only considering finitary connectives in the fragment [].