After the connectives x, ® and +, the next thing we want to salvage from set theory is the system of natural numbers, equipped not only with arithmetic but also with induction and recursion. The latter have a long history in number theory, but the idioms are readily transferred to lists, trees and recursive programs. Set theory itself, with the axiom of foundation, admits a more general notion of induction on the Î relation, of which transfinite iteration is a useful special case. Another similar, also rather disorderly, form of induction is that needed to prove properties of selfcalling programs. These will be our starting point.
Induction and recursion are often confused in symbolic logic, but the difference between them is another one between propositions and types: we prove theorems by induction, but construct programs by recursion. The two are linked by the General Recursion Theorem 6.3.13 (it is not enough ``to prove by induction that p(x) is defined,'' as this wrongly treats existence as a predicate).
DEFINITION 2.5.1 A selfcalling program is said to obey the recursive paradigm if it defines a function p(x) of exactly one argument and it proceeds as follows:
We shall call phases (a) and (c) parsing and evaluation respectively, and write u\prec t whenever u is a subargument of t. This may mean that u is symbolically a subexpression of t, in which case we're doing structural recursion, but we shall use the notation and terminology for any recursive problem which fits the paradigm, whatever the means by which the us are obtained from t.
Recursive definitions of functions with many arguments, and systems of functions which call each other ( mutual recursion), are no more general than the paradigm, because the arguments or results may be packaged into a ``tuple.'' The complication which we have not allowed is that a sub result might be fed back to help generate another sub argument. The effect of the paradigm is to allow the parsing of hereditary subarguments to be exhausted before any evaluation is done; this will be expressed more formally in Definition 6.3.7. A special case, tail recursion, is discussed in Section 6.4, where it is shown to be equivalent to the imperative while construct.

How do we prove that such programs are correct? There is nothing to say in the case of the factorial function, because it is evaluated directly from its definition. The correctness of the compiler, ie of its output (object code) relative to the input (source), is given case by case for the connectives of the language: for example the Floyd rules in Sections 4.3, 5.3 and 6.4 for imperative languages say that if the subexpressions satisfy certain conditions then so does the whole program(fragment). Finally, Gauss's paper gave a method of deriving the auxiliary polynomial and showed that if it has a root then this provides a root of the original one.
Induction In all of these arguments, and whenever we want to prove consequences of recursive definitions, it is necessary to demonstrate
a typical instance, q[t], of the required property, from "u.u\prec tÞ q[u], the induction hypothesis, that all of the subarguments have the property,
by methods which are peculiar to the application. Common usage often forgets that the q[t] so proved is not that of the ultimate conclusion, "x.q[x], but only a stepping stone. In order to justify a single q[a], the induction step must be proved for every t (or at least for all descendants of a). Such sloppiness in the presentation can give the impression that induction is circular.
To obtain the desired result, we have to use an additional rule, of which the proof ``by induction'' is the premise.
DEFINITION 2.5.3 The induction scheme is the rule

The important point formally is that the variable t in the premise and the induction hypothesis which it satisfies are bound in an ("Þ Á) proof box, so nothing is to be assumed about t apart from this.

We have followed tradition in using a symbol \prec suggesting order for well founded relations, but this is extremely misleading. The motivating examples of immediate subprocesses and subexpressions above are not transitive, and Corollary 2.5.11 shows that \prec is irreflexive.
Genealogical analogies can be useful, but there is a conflict of intuition about the direction. In set theory it is traditionally synthetic: ``on the first day, God made Æ'' [Knu74]. Our recursive paradigm is analytic, and subprocesses are usually called children, so Æ is the ultimate descendant.
DEFINITION 2.5.4 If u\prec t or u\prec \prec t then we refer to u as a child or a descendant of t, respectively, where \prec \prec is the transitive closure of \prec . For any predicate f on X, we say that t is hereditarily f if all of its descendants satisfy f, ie




Whereas simple induction deals with orderly, stepbystep reductions (like nuclear decay by a or bradiation), numerous algorithms split up graphs into parts whose size may be anything between zero and the original size (like fission). For such problems we need courseof values induction. But the problem for a graph with n nodes does not usually reduce to subproblems involving graphs of every size < n (the products of fission of ^{235}U by neutrons typically have masses about 90 and 145), so \prec is usually sparser than < .
Section 2.7 considers induction and recursion for numbers and lists.
Minimal counterexamples If some property is not universally true of the natural numbers, there is a least number for which it fails. (Georg Cantor generalised this idea to define the ordinals, Section 6.7.) This least number principle depends on excluded middle, but with a little care such proofs can be made intuitionistic.
PROPOSITION 2.5.7 [Richard Montague, 1955] A relation \prec is well founded iff every nonempty subset has a \prec minimal element.
PROOF: Let Æ ¹ V Ì X, and put q[x] = (x Ï V). This means that the conclusion of the induction scheme ( "x.q[x]) fails. The scheme is therefore valid iff the premise also fails, ie

REMARK 2.5.8 The classical idiom of induction is to use the minimal counterexample to show that "x.q[x] by reductio ad absurdum. Like Example 1.8.2, this is very often gratuitous. The hypothesis ``let t be a minimal counterexample'' breaks into two parts:
The second part is the induction hypothesis as given in Definition 2.5.3. Commonly, q[t] can be deduced without using the first part, so this may be eliminated to give a sound intuitionistic argument.
Imagine a proof by induction as the tide climbing up a beach: in the end the whole beach gets submerged. The induction scheme requires us to show that any part which is above only water soon gets wet. The classical induction hypothesis is that there is somewhere which is on the waterline but still completely dry  a rather implausible supposition!
COROLLARY 2.5.9 Any (properly) recursive definition of a function which terminates on all values includes an analysis ( parsing) into
This feature of well founded structures and free algebras will recur many times, in the rest of this chapter, in Sections 3.7 3.8, and in Chapter VI.
Descending chains Dependent Choice (Definition 1.8.9) gives another condition which is nonconstructively equivalent to wellfoundedness.
PROPOSITION 2.5.10 (X,\prec ) is well founded iff there is no sequence of the form ···\prec \termu_{3}\prec \termu_{2}\prec \termu_{1}\prec \termu_{0}.
PROOF:
The geometrical form of Euclid's algorithm (Example 6.4.3) was perhaps the first statement of induction: an infinite sequence of numbers is found, each less than the one before, which, as Elements VII 31 says quite clearly, is impossible amongst whole numbers. When this algorithm is applied to the side and diagonal of a square or pentagon, however, the successive quotients form a periodic sequence. David Fowler [Fow87] has made an appealing reconstruction of classical Greek mathematics, in which he claims that Euclid's book was motivated by problems like this. (The story that the discovery of the irrationality of Ö2 led to the downfall of the Pythagorean sect seems to have been an invention of Iamblichus nearly 1000 years later.)
In the investigation of the triangle which bears his name, Blaise Pascal (1654) stated lemmas for one case and for any ``following'' one, and concluded that the result holds for all numbers. John Wallis (1655) may also have been aware of the logical principle, but Pierre de Fermat (1658) was the first to make nontrivial use of the method of infinite descent to obtain positive results in number theory (Exercise 2.33).
A variant of this result is (Dénes) König's lemma (1928). A tree is an oriented graph with a distinguished node (its root) from which to each node there is a unique oriented path. Since in most applications the branches (outgoing arrows) from any node are labelled in some fashion, the Choice of branch may usually be made canonically. To any programmer, the procedure in König's lemma is depthfirst search.
COROLLARY 2.5.11 If, in a tree, every node has finitely many branches and there is no infinite oriented path, then the tree is finite.
PROOF: As there is no w^{op} sequence, the branch relation is well founded, so induction on it is valid. Using this, every subtree is finite. []
Using this formulation of induction is not so easy as it may appear: somehow we have to find that forbidden infinite sequence, which usually requires König's lemma.
Notice that we never said that the terms in the sequence had to be distinct; of course if only finitely many values may occur then any infinite sequence of them must contain repetitions ( loops).
COROLLARY 2.5.12 If the relation has a cycle, u\prec u, or u\prec v\prec u, or u\prec v\precw\prec u, etc , then \prec is not well founded. []
Proof trees The graphical presentation, as a tree without cycles, shows the difference between the induction step which is to be proved and the induction hypothesis which can be assumed.

REMARK 2.5.13 The ingredients for an inductive proof of q[n] by the Peano primitive induction scheme are the axiom/zero q[0] and the rule/successor "n.q[n]Þ q[n+1]. If we need to prove q[3], then these may be composed, showing the different roles of the occurrences of q. Compare this with the proofs we called 0, 1 and 2 in Example 2.4.1, and notice how the structure of the term directly gives that of the proof.
Termination Infinite descending sequences are very familiar to anyone who has tried to execute an ill founded recursive program: the sequence is stored on the machine stack, which overflows. (Nonterminating loop programs are usually silent.)
REMARK 2.5.14 (Robert Floyd, 1967) In what circumstances can a recursive program p which obeys the paradigm (Definition 2.5.1) fail to terminate on a particular argument \termu_{0}? If, and only if, the execution of p(\termu_{0}) involves the computation of some p(\termu_{1}), which in turn requires some p(\termu_{2}) and so on ad infinitum.
For any two values t and u of the domain of definition, write u\prec_{p}t if u is one of the subarguments generated in the first step of the computation of p(t). Then the program p terminates on all arguments iff \prec_{p} is well founded. Let q[x] denote that p terminates on the argument x. (This is a higher order predicate; indeed its mathematical content is really that \prec_{p} restricted to the descendants of x is well founded.) The premise of the induction scheme holds by assumption on the form of p, that the first and third stages of the program terminate. The conclusion is universal termination, so this is the case iff the induction scheme is valid. []
A good definition in mathematics is one which is the meeting point of examples and theorems, cf introduction and elimination rules in logic. Definition 2.5.3 is such a meeting point. Minimal counterexamples and descending sequences may perhaps help to give you a picture of how induction works, but they do not prove it. Without excluded middle, nor can they even prove the results for which induction is needed.