Practical Foundations of Mathematics

Paul Taylor

1.2  Denotation and Description

The basic understanding of operations in an arithmetic expression is that they transform their arguments into results, eg sin 150° \leadsto 1/2.
The two sides are not simply equal: we have to do a calculation, or at least consult a dusty table of trigonometric functions. In algebra we say
u×(v+w) = (u×v)+(u×w);
one side may get us closer to the desired result than the other, but which it is depends on the situation.

EXAMPLE 1.2.1 A rational number may be represented as a fraction n/d with d ≠ 0 in many ways. The synonyms are related, written

(n,d) ~ (n',d'),     if    nd' = n'd.
There is a normal form, obtained by Euclid's algorithm (Example 6.4.3), in which n and d have no common factor. But there is no need to reduce everything to normal form at each step of a calculation, since the arithmetic operations respect the equivalence relation.
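The point can be sketched in a few lines of Python (an illustration only; the function names are invented): fractions are kept as raw pairs, equality is the cross-multiplication test nd' = n'd, and the operations act directly on representatives, so reduction to lowest terms is never needed.

```python
# Rationals kept as raw pairs (n, d) with d != 0, never reduced.
def frac_eq(a, b):
    """(n,d) ~ (n',d')  iff  n*d' == n'*d -- the synonym relation."""
    return a[0] * b[1] == b[0] * a[1]

def frac_add(a, b):
    """n/d + n'/d' = (n*d' + n'*d)/(d*d'), acting on representatives."""
    return (a[0] * b[1] + b[0] * a[1], a[1] * b[1])

def frac_mul(a, b):
    """(n/d) * (n'/d') = (n*n')/(d*d')."""
    return (a[0] * b[0], a[1] * b[1])

# 1/2 + 1/3 = 5/6, computed from the unreduced synonyms 2/4 and 3/9:
assert frac_eq(frac_add((2, 4), (3, 9)), (5, 6))
# The operations respect ~ : equivalent arguments, equivalent results.
assert frac_eq(frac_mul((1, 2), (2, 3)), frac_mul((2, 4), (6, 9)))
```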

Similarly, a (positive, zero or negative) integer z may be coded as the set of pairs (p,n) of (zero or positive) natural numbers subject to

(p,n) ~ (p',n')    if    p+n' = p'+n,
so p-n = z. The ``obvious'' way of coding integers, as absolute values together with signs, needs a case analysis to define addition. This is avoided by using equivalence classes, and we see that it is not necessary to use normal forms; for example you may have both savings and a loan with a particular bank.
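The pairs coding can likewise be sketched (again purely illustrative names): z is represented by any pair (p,n) with p−n = z, addition is componentwise with no case analysis on signs, and negation simply swaps savings and loan.

```python
def int_eq(a, b):
    """(p,n) ~ (p',n')  iff  p + n' == p' + n, ie p - n = p' - n'."""
    return a[0] + b[1] == b[0] + a[1]

def int_add(a, b):
    """Componentwise: (p,n) + (p',n') = (p+p', n+n') -- no sign analysis."""
    return (a[0] + b[0], a[1] + b[1])

def int_neg(a):
    """Negation swaps the savings and the loan."""
    return (a[1], a[0])

# 3 + (-5) = -2, using the representatives (3,0) and (0,5):
assert int_eq(int_add((3, 0), (0, 5)), (0, 2))
# Well-definedness: another synonym (2,7) for -5 gives an equivalent answer.
assert int_eq(int_add((3, 0), (2, 7)), (0, 2))
```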

In these two examples it is necessary to check that

x ~ x' ∧ y ~ y'    ⟹    r(x,y) ~ r(x',y')
for each arithmetic operation r = +, ×, −, etc., to show that these are well defined for the values (equivalence classes) of rationals and integers.

Platonism and Formalism   As there are syntactically different terms which are synonyms for the same value, the expressions themselves can never capture exactly what mathematical values are.

Various mathematical philosophies offer different views about what these eternal values are meant to be. One view is that the constants are tokens for real things like sheep and pebbles (Exercise 1.1) and the operations combine them. Our investigations are merely passive observations of an eternal and unchanging world, in which there is a true answer to every question, even though we may be (provably) unable to find it. This is known as Platonism.

Although most mathematicians habitually think and speak in Platonist language, this philosophy is ridiculously naive. It brings mathematics down to an experimental science, in which we can only infer laws such as the associativity of addition by scientific induction from the cases which have been observed. How can we know when all of the laws have been codified? Have they been asserted in excessive generality? We don't know. Bertrand Russell noticed that Gottlob Frege's use of comprehension was too general, and Kurt Gödel showed that Russell's own system (or anything like it) could express certain true facts about itself which it could not prove, unless it was inconsistent.

Formalism denies absolute being: only the symbols themselves exist, and nothing else. The notion of value must then be a derived one: it is defined as that which is common to all of the other expressions which are directly or indirectly related to the given one. What is common is simply their totality: the equivalence class.

Formalism is regarded by some as nihilism: if mathematics is just a game with symbols and arbitrary rules, what's the point of playing it? Chess is also a game with arbitrary rules played by many mathematicians, but to a chess master it has latent structure, a semantics. We hope to show that the rules of logic are not so arbitrary as those of chess, and exhibit many of the symmetries which mathematicians find beautiful in the world.

So long as we fix our sights on a reasonably small fragment of logic, there is a synthesis between the two points of view: out of the formalism itself we construct a world which has exactly the required properties. The free algebra for an algebraic theory is an example of such a world; it satisfies just those equations which are provable. Joachim Lambek has argued [LS86, p. 123] that it is possible to reconcile the Platonist and Formalist points of view.

Laws as reduction rules   Using the tree notation, the distributive law, seen as a way of transforming or computing with expressions, is

omitted diagram environment

We distinguish between equations, which may or may not hold, and laws, which hold because (as in the legal sense) we choose to enforce them. In the λ-calculus the term δ-rule is used, and occasionally we shall call an ad hoc imposed equation a relation (``regulation''), though this term is normally reserved for another sense. Beware that the word law is also used elsewhere for what we call an operation, such as multiplication; this sense is archaic in English but current in French (loi de multiplication).

DEFINITION 1.2.2 In a law the variables on the left hand side name sub-trees in a pattern to be matched. The variables on the right say where to copy these sub-trees (maybe several times, maybe not at all), so every variable occurring on the right must also occur on the left.

Any (sub-)expression which matches the pattern on the left is called a redex ( reducible expression), and it reduces to the substituted expression on the right. A term may have many redexes in it.

The result of a ``reduction'' may be a longer expression than the first, and there may indeed be infinite computations. A sequence of terms in which each is obtained by replacing a redex in the previous one by the term to which it reduces is called a reduction path. We write u\leadsto v if (u and v are identical expressions or) there is a reduction path whose first and last terms are u and v respectively.

A term with no redexes is said to be irreducible, but note that this only means that we have reached a dead end: by going backwards and following another reduction path some different result might be obtained.
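Definition 1.2.2 can be illustrated by a small Python sketch (the representation of terms as nested tuples is an arbitrary choice of mine): the distributive law matches the pattern u×(v+w), and the variable u on the right says that its sub-tree is to be copied twice.

```python
# Terms: atoms are strings; compound terms are triples (op, left, right).
def reduce_redex(t):
    """If t itself matches u*(v+w), rewrite it to (u*v)+(u*w); else None."""
    if isinstance(t, tuple) and t[0] == '*' and \
            isinstance(t[2], tuple) and t[2][0] == '+':
        u, (_, v, w) = t[1], t[2]
        return ('+', ('*', u, v), ('*', u, w))   # the sub-tree u is copied twice
    return None

def step(t):
    """Reduce the leftmost-outermost redex anywhere in the term, if any."""
    r = reduce_redex(t)
    if r is not None:
        return r
    if isinstance(t, tuple) and len(t) == 3:
        for i in (1, 2):
            r = step(t[i])
            if r is not None:
                return t[:i] + (r,) + t[i + 1:]
    return None  # irreducible: no redex at all

t = ('*', 'u', ('+', 'v', 'w'))
assert step(t) == ('+', ('*', 'u', 'v'), ('*', 'u', 'w'))
assert step(('+', 'v', 'w')) is None   # already irreducible
```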

Equivalence relations   For an arbitrary set of laws we have no guide to say whether a reduction gets nearer to a ``result'' or makes matters worse. In this case the equivalence class is the only notion of static value which we can give to an expression, and there is no systematic way of determining whether two expressions are equal.

DEFINITION 1.2.3 Reduction, \leadsto , is the sparsest binary relation which contains reduction of redexes, respects substitution and is reflexive and transitive:

omitted prooftree environment omitted prooftree environment

omitted prooftree environment        omitted prooftree environment
Some authors, particularly when studying term-rewriting, use \leadsto* for the reflexive-transitive closure of \leadsto , after Russell's R*. Another notation for this is \twoheadrightarrow , but we shall not use either \leadsto* or \twoheadrightarrow in this book.

Equivalence, ~ , is the least relation which is symmetric as well:

omitted prooftree environment
Any relation which is reflexive, symmetric and transitive is called an equivalence relation. These are essentially Euclid's Common Notions.

(We have just started using a way of expressing conditional assertions, ie rules of deduction, which will be very useful throughout the book. The ruled line means that whenever what is written above holds then what is written below follows, for the reason on the right.)

The notion of reduction path is an example of the transitive closure of a relation; it is reflexive if we include the trivial path (the equality or identity relation). Although any relation can be made symmetric by adding its converse, ie making the arrows bi-directional, this may destroy transitivity. To form the equivalence closure we need

LEMMA 1.2.4 Let \leadsto be any reflexive-transitive relation, viewed as an oriented graph. Then two nodes u and v are related by the smallest equivalence relation containing \leadsto iff there is a zig-zag, ie a sequence u = u_0, u_1, u_2, ..., u_{2n} = v where

omitted diagram environment

We say that u and v are in the same connected component, whatever the orientations of the arrows involved. []
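Lemma 1.2.4 amounts to computing connected components, as the following Python sketch (the graph encoding is my own choice) does: the orientation of each one-step reduction is forgotten, and u ~ v iff a breadth-first search joins them.

```python
from collections import defaultdict, deque

def equiv(reductions, u, v):
    """u ~ v iff a zig-zag of one-step reductions joins them, ie they
    lie in the same connected component once arrows are bi-directional."""
    adj = defaultdict(set)
    for a, b in reductions:     # a \leadsto b in one step
        adj[a].add(b)
        adj[b].add(a)           # forget the orientation
    seen, queue = {u}, deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return True
        for y in adj[x] - seen:
            seen.add(y)
            queue.append(y)
    return False

# The zig-zag a ~> b <~ c ~> d relates a and d:
assert equiv([('a', 'b'), ('c', 'b'), ('c', 'd')], 'a', 'd')
assert not equiv([('a', 'b')], 'a', 'z')
```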

We shall discuss the transitive closure in Sections 3.8 and 6.4.

In a formal system of expressions and laws with no underlying Platonist meaning, it may be that there are so many indirect laws that the terms which we have chosen to name the numbers and truth values all turn out to be provably equal. This is known as algebraic inconsistency.

Confluence   The following definition captures the Church-Rosser Theorem, which showed that the pure λ-calculus is consistent (Fact 2.3.3).

DEFINITION 1.2.5 A system of reduction rules is said to be confluent if, whenever there are reduction paths t\leadsto u and t\leadsto v, there is some term w with reduction paths u\leadsto w and v\leadsto w. Other names for this property are diamond and amalgamation.

What we prove in practice is local confluence, that any two one-step reductions may be brought back together. If it only takes one step on each side to do this then the following result still holds. Usually more than one step is needed, so the paths to be reconciled may just get longer.

Confluence is a property of the presentation of a system of rules. Donald Knuth and Peter Bendix [KB70] showed how a system of algebraic laws may sometimes be turned into a confluent system of reduction rules.

LEMMA 1.2.6 Suppose u and v are equivalent terms with respect to a transitive confluent relation \leadsto . Then u\leadsto w and v\leadsto w for some term w. In particular for each term there is at most one irreducible form to which it is equivalent, and distinct irreducible forms are inequivalent, so we call them normal.

PROOF: Confluence changes each ``zag-zig'' to a ``zig-zag,'' as shown with dotted arrows in the diagram accompanying Lemma  1.2.4. []

EXAMPLE 1.2.7 Soldiers in Lineland face towards the right if they have even rank (0, 2, 4, ...), whilst those of odd rank face left. A sequence of soldiers, which we write with semi-colons between them, is called a parade if each can see only those facing in the same direction and of the same or lower rank. Adjacent soldiers annihilate if they are facing each other and their rank differs by exactly 1 (eg 2;3 or 6;5). Otherwise, if a junior is facing an adjacent senior of either parity then they change places, but the senior loses two grades (eg 4;7\leadsto 5;4,    6;8\leadsto 6;6,    3;1\leadsto 1;1).

Since the total rank decreases, any conflict of forces terminates, and the result is a parade. An algebraist might take the parades alone as the elements of the structure, and seek to show that (;) is an associative operation. The term-rewriting approach would show that the reduction rules are confluent. In fact exactly the same calculations are needed either way, and we deduce that any conflict has a foregone conclusion. Hence parades form a monoid whose unit is the empty sequence; we shall use it in Example  7.1.9. []
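The parade rules are concrete enough to execute. In the following Python sketch (the encoding of facing by parity of rank follows the example; everything else is a choice of representation), confluence can be checked empirically by comparing the leftmost and rightmost reduction strategies.

```python
def reduce_at(p, i):
    """Reduce the adjacent pair (p[i], p[i+1]) if it is a redex, else None.
    Even ranks face right, odd ranks face left."""
    a, b = p[i], p[i + 1]
    a_faces_b, b_faces_a = a % 2 == 0, b % 2 == 1
    if a_faces_b and b_faces_a and abs(a - b) == 1:
        return p[:i] + p[i + 2:]                 # annihilation: 2;3, 6;5
    if a_faces_b and a < b:                      # junior a faces senior b
        return p[:i] + [b - 2, a] + p[i + 2:]    # eg 4;7 ~> 5;4
    if b_faces_a and b < a:                      # junior b faces senior a
        return p[:i] + [b, a - 2] + p[i + 2:]    # eg 3;1 ~> 1;1
    return None

def parade(p, leftmost=True):
    """Normalise by always reducing the leftmost (or rightmost) redex;
    this terminates because the total rank strictly decreases."""
    while True:
        redexes = [i for i in range(len(p) - 1) if reduce_at(p, i) is not None]
        if not redexes:
            return p
        p = reduce_at(p, redexes[0] if leftmost else redexes[-1])

assert parade([2, 3]) == []                      # 2;3 annihilate
# Different strategies, one foregone conclusion:
assert parade([4, 7, 3, 1], True) == parade([4, 7, 3, 1], False) == [1, 3]
```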

DEFINITION 1.2.8 A system of reduction rules is weakly normalising if every term has some reduction path which leads to an irreducible form. It is strongly normalising if there is no infinite reduction path.

THEOREM 1.2.9 In a normalising confluent system of reduction rules, each term is equivalent to exactly one normal form, and can be reduced to it. If the system is strongly normalising then local confluence suffices, and every reduction path leads to the normal form. []

Normal forms, where they exist, provide a versatile tool. Suppose that some formula is provable in a logic whose proofs have this property; then (without being given a proof) we know that any such proof must be of a particular form, involving certain other formulae which are also provable. This observation can be used to prove a powerful result about ∨ and ∃ in intuitionistic logic (Remark 2.4.9 and Section 7.7).

There are idioms in the English language which exploit uniqueness, such as that provided by confluence.

Theory of Descriptions   In the vernacular we speak of the Moon, using the definite article, because Earth has only one. By contrast, Ganymede is a satellite of Jupiter: the indefinite article is used since there are several others. Similarly, when there is exactly one term v which is normal and equivalent to u, we call it the normal form of u.

DEFINITION 1.2.10 Let φ[x] be a predicate of one argument. Then

(a)
φ is a description if it is satisfied by at most one thing. The best way to say this is that any (∀) two solutions are equal or, in symbols,

∀x, y.   φ[x] ∧ φ[y]  ⟹  x = y

(b)
a description φ denotes if something (∃) satisfies it:
∃x.φ[x].
In this case we may speak of the solution, ie what the description describes or denotes. The notation
ιx.φ[x]
is sometimes used for this solution. We write ∃!x.φ[x] for unique existence, the conjunction of ∃x.φ[x] and ∀x, y. φ[x] ∧ φ[y] ⟹ x = y. Some equivalent forms will be given in Exercise 1.8.
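Over a finite universe the definition can be executed directly; in the following Python sketch (names invented for illustration), ιx.φ[x] is defined exactly when φ is a description that denotes.

```python
def iota(phi, universe):
    """iota x. phi[x]: the unique solution of the description phi.
    Defined iff phi is a description (at most one solution) which
    denotes (at least one solution)."""
    solutions = [x for x in universe if phi(x)]
    if len(solutions) > 1:
        raise ValueError("not a description: two distinct solutions")
    if not solutions:
        raise ValueError("the description does not denote")
    return solutions[0]

# "the Moon" denotes, because Earth has only one satellite:
assert iota(lambda x: True, ["Moon"]) == "Moon"
# "the satellite of Jupiter" is not a description -- Ganymede has company:
try:
    iota(lambda x: True, ["Io", "Europa", "Ganymede", "Callisto"])
    assert False
except ValueError:
    pass
```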

It is convenient to extend usage to descriptions which are not known to denote, that is, to a general situation in which we would recognise a widget if we met one and know that there cannot be two which are different, but where there need not actually be any. Then we may say

``the widget is grue''

to mean that, should we ever find ourselves in possession of a widget, it will necessarily be found to be grue, where grueness (ψ[x]) is any predicate, not necessarily another description. In symbols,

"x.f[x] y[x]        where "x, y.f[x]f[y] x = y.
As a further abuse,

``the widget (if it exists) is unique''

means that the predicate φ characterising widgets is a description.

``The'' and ``unique'' must be used with predicates: it is meaningless to say that a thing is unique, whatever abuses advertisers may make of the English language. It is similarly wrong to introduce putative things such as unicorns and then treat existence as a property of them.

LEMMA 1.2.11 Let φ be a description and ψ any predicate. Then

if ∃x. φ[x] ∧ ψ[x] then ∀x. φ[x] ⟹ ψ[x]

and conversely if φ denotes; in the latter case we write

ψ[ιx.φ[x]].

That is, if we have a grue widget, then we know that all widgets are necessarily grue.

PROOF:

omitted prooftree environment
[]

The ι-calculus notation had been in use in the late nineteenth century, but it was Bertrand Russell who sorted out its meaning. A description such as ``the author of Waverley'' is not a name - it is incomplete, having no meaning until it is embedded in a sentence such as ``Scott is the author of Waverley'' - and may be contingent (``the President''). Unfortunately some logic texts to this day assign an arbitrary value such as zero to the two meaningless cases, leading to nonsense like saying that

the unicorn is the author of Principia Mathematica

is true since 0 = 0 (and this book had two authors).

Synonyms   The theory of descriptions puts a premium on uniqueness, but how can we reconcile this with the stress we put before on the many concrete forms which a mathematical object such as 5 may take? One way is to appoint either the equivalence class (all quintuples) or a normal form ({0,1,2,3,4}) to be what the object is. The former is objectionable because it introduces extraneous material (the protons in a boron atom, as well as the fingers on one hand), and indeed a proper class of it, whilst Example 1.2.1 shows that the latter is unnecessary. As Pál Halmos [Hal60] points out, we look to physics for the way standards are defined: when we want to measure a kilogram, we compare it with a particular platinum-iridium bar in France. See [Ben64] for discussion.

The deeper we dig into foundations, to find out what things do, the less sure we are about what they are: we must be content with knowing how to exchange equals for equals.

DEFINITION 1.2.12 A binary relation ~ is a congruence if it is an equivalence relation (reflexive, symmetric and transitive) and permits the substitution of equals for equals in any predicate φ:

omitted prooftree environment
If ~ is the equivalence relation generated by reduction (\leadsto ), it suffices to verify this rule for a\leadsto b and b\leadsto a (with w = φ[x], this is post-subs in Definition 1.2.3), and this property is known as subject reduction.

The doctrine of interchangeability does not allow us to test for equality in substance, and so the `` = '' sign in Definition  1.2.10 and the proof of Lemma  1.2.11 must be replaced by congruence.

So any two things satisfying the description have to be congruent rather than equal. Conversely, the congruence law says that anything which is equivalent to something satisfying the description also satisfies it.

A description specifies uniquely up to interchangeability.

The idea is that two things are the same if they have all properties in common, cf Leibniz' principle, Proposition  2.8.7. Finally, testing a property for any one representative suffices, by the congruence law again, to prove it for all of them.

For terms denoting individual values these remarks are maybe academic, but significant technical issues arise when we turn to structures. Then the congruence is not just a property but a method of passing from one representation to another and back ( isomorphism). Now the admissible properties are those which are invariant with respect to isomorphism. We often say things like ``the product is unique'' or ``the quotient is compact,'' to be understood in the above senses.

Since a structure may have non-trivial isomorphisms with itself, few interesting properties survive indiscriminate isomorphisms; for example all of the points on a sphere become identified. There are two isomorphisms between my shoes and my feet, but choosing the wrong one is rather painful! The isomorphism between my socks, however, is a significant one geometrically, as a non-trivial group is involved (Example 6.6.7). In order to express what we have to say about a structure we must follow the parts of the structure through the transformations.

Therefore mathematical objects are defined up to interchangeability, but we must pay attention to the means of exchange: isomorphisms of objects and equivalences of categories (Definition 4.8.9). Opinions differ on how to handle this issue, for example Michael Makkai [Mak96] has taken an extreme semantic point of view in which equality of objects is unthinkable. The syntactic position is that we only deal with names of types, which can therefore be compared. In Section 7.6 we shall show that this conflict can be resolved, by using interchangeability of categories to restore equality of objects.

We shall explore the relationship between the formal and vernacular ways of expressing mathematics further in Section 1.6. The idea of interchangeability and the means of exchange will be used in Section  4.4. Algebras with laws are the subject of Section  4.6, and Section 5.6 treats equivalence relations. Now we turn to parametric descriptions.