Practical Foundations of Mathematics

Paul Taylor

1.4 Direct Reasoning

To the first approximation, a proof is a sequence of assertions, each justified by the earlier ones. We don't actually perceive the truth of the later assertions by themselves, but only accept them because, like accountancy, they follow necessarily from what's gone before. David Hilbert proposed that this style be used systematically in mathematical logic. We shall need to modify it in the next section, but those proofs which can be written like this we call direct deductions.

The style has been traditional in editions of Euclid's Elements since the invention of printing:

First we write down the hypotheses, each on a separate line and annotated as such. Then we write any formulae we wish to derive from them (and from formulae which have been derived already), noting the names of the rules and hypotheses used.

Each step is introduced by \therefore ( therefore) and its reason by \because ( because). The reason cites one of a small number of rules of inference and some previously asserted formulae (the premises of the rule). In presenting the rules we shall employ the ruled line notation used in Definition 1.2.3.

Each one of the rules makes an appearance somewhere in ancient or medieval logic, long before Boole. So they all have Latin names, which shed no light at all on their structure (even if you can read Latin). As Russell commented, ``Mathematics and Logic, historically speaking, have been entirely different studies. Mathematics has been connected with Science, Logic with Greek.'' The only Latin name in common use in mathematics is modus ponens, which we call (Þ E ). The symbolic names reflect the symmetries of logic.

The language of predicate calculus

DEFINITION 1.4.1 Formulae are built up as follows:

(a)

Atomic predicates r[\termu₁,¼,\termu_k] where \termu_i is a term of type \typeX_i. The number k is called the arity. Predicates for which k = 0 are usually called propositions; in particular there are constants

T (true) and ^ (false).

If k ³ 2 we speak of relations, as in the last section. For example we have the binary relations of equality, u = v, order, u £ v, and membership, u Î w (Definition 2.2.5), where u,v:X and w:P(X).

(b)

If f and y are formulae then so are

fÙy (and), fÚy (or) and fÞ y (implies);

\lnot f (not) and fÛ y (equivalent) are abbreviations for fÞ ^ and (fÞ y)Ù(yÞ f) respectively. The arrow should be read ``in so far as f (the antecedent) holds, then so does y (the consequent).'' These symbols are collectively known as connectives.

(c)

If f is a formula and x a variable of type X then

$x:X.f ( for some) and "x:X .f ( for all)

are formulae. The symbols " and $, which are called quantifiers, bind variables in the way that we described in Definition 1.1.6ff.

The phrase ``there exists'' will be discussed in Remarks 1.6.2(f) and 1.6.5.

The direct logical rules

DEFINITION 1.4.2 The direct logical rules are

= 0pt omitted array environment

In the last column, x is a variable (which, by a-equivalence, may be assumed not to occur elsewhere) and a any term of the same type. This is called an instance of "x.f[x] or a witness of $x.f[x]. Using the substitution notation (Definition 1.1.10ff), these two rules become

omitted prooftree environment omitted prooftree environment

to which we shall return in Remark 1.5.5.

REMARK 1.4.3 The names of the rules specify the connective involved - that is, the outermost connective (Remark 1.1.1), as the formulae f and y may themselves involve connectives. The Á stands for an introduction rule, ie where the formula in the conclusion is obtained from that in the premise by adding the connective in question. Similarly E indicates an elimination rule, where the connective has been deleted. We employ the introduction rule to give a reason for the formula, and the elimination rule to derive consequences from it.

Notice how the elimination rules for Ù and " mirror the introduction rules for Ú and $. (TÁ ) and (^E ) and, to some extent, the other rules also match up, although the duality is seen more strictly in classical logic, Theorem 1.8.3. The other three direct logical rules each have two premises, so cannot have mirror images.

omitted prooftree environment omitted prooftree environment omitted prooftree environment

For negation, (\lnot E ) is the special case of (Þ E ) where y = ^.

REMARK 1.4.4 The (^E )-rule comes as a surprise to novices. It is like playing the joker in a game of cards: ^ stands for any formula you like. The strategy usually disregards this card ( minimal logic leaves it out), but it can sometimes save the day, eg in Remark 1.6.9(a).

REMARK 1.4.5 There are two (ÙE )- and (ÚÁ )-rules, whilst (ÙÁ ) has two premises and (ÚE ) has two sub-boxes (Definition 1.5.1), since these connectives are binary. The nullary (T, ^) and infinitary (", $) versions follow the same pattern: in particular, there is no (TE )- or (^Á )-rule.

Since the (ÙÁ ), (Þ E )- and (\lnot E )- rules, together with transitivity (1.2.3), description (1.2.10(a)) and congruence ( 1.2.12) have two premises, the ancestry of a deduction is not linear but tree-like, eg Lemma 1.2.11. Proofs, like expressions, involve binary operation-symbols, which we shall come to recognise as pairing for (ÙÁ ) and evaluation for (Þ E ).

Whilst the tree style shows more clearly the roles played by individual hypotheses in a deduction, it can repeat large parts of the text when a derived formula ( lemma) is used more than once as a subsequent premise, as in induction (Remark 2.5.12). Big sub-expressions also get repeated in algebraic manipulation, but this can be avoided by use of declarative programming (Definition 1.6.8 and Section 4.3).

The provability relation The presentation of a proof as a chain or tree of assertions is very convenient when the aim is to show that some result in mathematics is true. But from the point of view of logic per se, we need a notation which says that ``there is a proof of q from hypotheses f₁,¼,f_n.'' This list is to be understood conjunctively ( f₁Ù···Ùf_n); it is called the context and will be denoted by G. (Such contexts must not be confused with ``context-free languages,'' Example 4.6.3(d).) The provability assertion is written G\vdash q and is called a sequent .

REMARK 1.4.6 Provability is an inequality f\vdash q on formulae, whereas reduction rules (Section 1.2) defined when two expressions were equal. Ernst Schröder observed that the inequality is more natural in logic. Formulae are then equal if both f\vdashq and q\vdash f, abbreviated to f\dashv \vdashq.

Beware that f₁,f₂\vdash q₁,q₂ means f₁Ùf₂\vdash q₁Úq₂ in Gentzen's classical sequent calculus, but for us the comma means conjunction on both sides.

The structure consisting of inter-provability classes (equivalence classes under \dashv \vdash ) of formulae is called the Lindenbaum algebra. In a much more general form (the classifying category or category of contexts and substitutions, Section 4.3) it will be the major object of study in this book; the weakening rule will play a crucial role in our construction.

Sequent presentation Whereas the rules of natural deduction say

if f is true then q is also true,

those of the sequent calculus have the form

if f is provable from G then q is provable from D,

where G and D are contexts.

DEFINITION 1.4.7 The provability relation, written using the turnstile \vdash (which comes from Frege, 1879), is generated by three classes of rules:

(a): the structural rules, which govern the way the formulae move around the proof,
(b): the logical rules, which determine the way in which the connectives and quantifiers behave, and
(c): the non-logical rules, which relate to symbols in the object- language.

For the logical rules we may make other distinctions: between the direct rules we have given and the indirect rules of the next section (using temporary hypotheses, which must be delimited somehow), and between introduction and elimination rules.

DEFINITION 1.4.8 The structural rules are the identity axiom, cut , exchange, weakening and contraction:

omitted prooftree environment omitted prooftree environment omitted prooftree environment

omitted prooftree environment omitted prooftree environment

The identity axiom and cut rule show straight away that the \vdash relation is reflexive and transitive (a preorder, Definition 3.1.1).

It is precisely the exchange and contraction rules which enable us to treat G as a set instead of a list (indeed in Section 8.2 we shall discuss a similar set of rules for type theory in which exchange is not generally valid, and there G must be considered as a list). The cut rule allows us to delete those hypotheses which are derivable from others; the exchange and contraction rules mean that the context of its conclusion is just GÈY.

The weakening rule says that hypotheses may be added to the context. If we allow weakening by an arbitrary set of hypotheses we can give a meaning to an infinite context: something may be proved in it iff it may be proved in some finite subset (though we shall not take this up).

REMARK 1.4.9 There are several ways of presenting the direct logical rules in sequent form, cf operation-symbols applied to either variables or terms (Remark 1.1.2). For example (\land E0) may be written as any of

fÙy\vdash f omitted prooftree environment omitted prooftree environment

The second and third are obtained from the first by using Cut on the conclusion and premise respectively. The moral of this is that, even when components of two systems perform the same function, the equivalence may rely on more primitive conditions.

Gerhard Gentzen [Gen35] used the third form in his sequent rules for intuitionistic logic. His Hauptsatz (German: main theorem) was that anything that can be proved in the sequent calculus is provable without the Cut rule: cut-free proofs are normal forms. All of his rules apart from cut have the sub-formula property: the formulae used in the premises are sub-formulae of those in the conclusion, so controlling the search for a proof. This proves consistency, but by using an induction principle which is stronger than the calculi under study.

Cut is redundant in Gentzen's calculus because he saturated the other rules with respect to it. This makes it a very cumbersome notation for justifying theorems, since the context must be copied from one line to the next, usually without change.