Practical Foundations of Mathematics

Paul Taylor

6.5  Unification

Now we shall apply the parsing and well-foundedness properties of term algebras to the unification algorithm of Remark 1.7.7. Our approach is a new one, which begins from an idea in universal algebra, and carries this through directly to a very efficient implementation.

The most general unifier has a universal property, which may be viewed in two ways: it is the equaliser in the category \CloneLx of contexts and substitutions (Example 5.1.3(b)) , and also the coequaliser of free algebras. The connection between these points of view is the fact that the clone Cn×L([x1:\typeX1,,xn:\typeXn],[y:Y]) is the free algebra on n generators (Theorem 4.6.7).

The most general unifier is not the coequaliser amongst all algebras.

EXAMPLE 6.5.1 Consider a unary operation s and two variables x and y. Then as a unification problem the equation s(x) = s(y) implies x = y, but the coequaliser as an algebra is essentially ``N with two zeros.'' []

Recall that the products in Cn×L must be preserved in any interpretation of the theory L in a category. The equalisers are not preserved because terms may have equal value without being provably so in the theory.

We shall discuss coequalisers amongst arbitrary algebras in Section  7.4, and characterise the subcategory of free algebras in Proposition 7.5.3(a). Section  8.2 constructs a version of the classifying category which does have some equalisers (and so whose interpretations preserve them).

For our work in this section we need first to restore the generators which we discarded in Definition 6.1.1. The recursion scheme for them is the universal property. The generators do not behave as extra constants as we said in Definition 6.1.1, because the homomorphism which solves the problem may send them to arbitrary terms in the target algebra Q.

NOTATION 6.5.2 Write hG:G FG for the inclusion of the generators G into the free algebra which they generate.

omitted diagram environment

Then any function f:G Q to another algebra extends uniquely to a homomorphism p:FG Q with hG;p = f. []

Unification   For any homomorphism f:A Q of algebras, the kernel pair {(a,b)|f(a) = f(b)} is an equivalence relation on A and is closed under its operation-symbols (Proposition 5.6.4). In a unification problem the kernel pair is also closed under parse :

LEMMA 6.5.3 Let f:A Q be a homomorphism from any algebra to an equationally free one (so even if A is free, f may send generators to expressions).

If f(\oprA([(u)\vec])) = f(\opsA([(v)\vec])) then r = s and "j:ar[r].f(\termuj) = f(\termvj).

omitted diagram environment

PROOF: Since f is a homomorphism,

\evQ (r,\vreq f(u)) = f(\oprA(
)) = f(\opsA(
)) = \evQ (s,\vreq f(v)),
but \evQ is mono, so r = s and "j:ar[r].f(\termuj) = f(\termvj). []

So two expressions are unifiable iff their outermost operation-symbols agree and the corresponding pairs of sub-expressions are each unifiable, but all by the same assignment to the generators. For unification to be possible, we must therefore be able to distinguish the generators and operation-symbols from one another, cf Example  6.2.5.

REMARK 6.5.4 Besides parse, the congruence is also closed under the operations r W and the axioms for an equivalence relation.

Transitivity via generators gives rise to parsable equations:
) = x     x = ··· = y     y = s(
where x = ··· = y is a zig-zag of equations amongst generators. (This affects the way in which we must prove termination: consider
x = u     y1 = \termv1     y2 = \termv2     y3 = \termv3     ···
r(x,r(x,r(x,))) = r(y1,r(y2,r(y3,)))
in which u reappears in a new equation arbitrarily deeply.)

It turns out that the other cases can be avoided:

transitivity of equations between terms - but from r([(u)\vec]) = r([(v)\vec]) and r([(v)\vec]) = r([(w)\vec]) we can deduce "j.\termuj = \termvj = \termwj anyway;

symmetry and reflexivity may be taken as read;

applying operation-symbols: {\termuj = \termvj |j:ar[r]} \vdash r([(u)\vec]) = r([(v)\vec]);

substitution of x = u into v = w, where x FV(v,w). By a similar argument this may also be postponed, unless v or w is the generator x, when transitivity via generators applies. []

LEMMA 6.5.5 Let p:FG Q be a homomorphism from a free algebra to any algebra. Then ( cf Lemma 1.3.8)

px = p u    py = p v px = p u    py = p
v[x: = u]

PROOF: By structural induction on v, considering x, y, z and r([(w)\vec]). []

REMARK 6.5.6 The previous Remark makes useful optimisations, as they avoid the need to compare or copy terms, providing a parallel, in situ algorithm which unifies terms almost as fast as they can be read. The terms may be represented in Polish notation (Example 6.2.7), but it is more usual to code each instance of an operation as a record consisting of the operation-symbol together with pointers to records for each immediate sub-expression (Remark 1.1.1). The pending equations are then held as pairs of pointers to these records. The program needs to store the equivalence relation E on generators, together with the equations R between generators and (pointers to) sub-terms.

The given and subsequently generated equations [(u)\vec] = [(v)\vec] do not need to be stored, since each may be dealt with as a sub-process:

if it is of the form r([(u)\vec]) = s([(v)\vec]) with r\not s then we have a clash;

the equation r([(u)\vec]) = r([(v)\vec]) forks into sub-processes for each uj = vj;

x = u or u = x is added to R, unless there is already an equation x E = xR = v, in which case u = v is handled as in the previous cases;

x = y is added to E and their equivalence classes amalgamated; if two R-equations become linked by u R = xE = x = y E = yR = v, then one of these is deleted and the unification step applied to u = v. []

LEMMA 6.5.7 The unification algorithm (applied to any finite set of equations between terms for a finitary free theory) terminates.

PROOF: Consider the total number of generators and operation-symbols in the outstanding equations, including the terms assigned by R to the generators. The generators listed in E and R are not counted. The operation-symbol r in case (b), or the generators x and y in cases (c) and (d), are deducted from this count at each step, so the program terminates by Proposition 2.6.2. (In fact only the outermost operation-symbol of u in case (c) is considered more than once, cf Remark 6.5.4(a).) []

THEOREM 6.5.8 The algorithm provides the most general unifier.

PROOF: The unification step continues to be applicable as long as there remains any equation relating a term to a term, possibly via generators. So when the iteration terminates, the outstanding equations consist of an equivalence relation E: G\leftharpoondown \rightharpoonup G together with a system of equations R:G\leftharpoondown \rightharpoonup FG such that, for x,y G and u,v FG,

u R = x E = y R = v      u v.
Then any unifier of E\rightrightarrows G FG factors through G/E and so F(G/E). By construction, the equations R link equivalence classes [x] G/E of generators to expressions, so without loss of generality E is ( = G).

omitted diagram environment

We want to partition the set G = D+I into dependent and independent generators, the former being the support of R.

Define x < y on G if x\not yx FV(v)yR = v. If there is any unifier p:FG Q then x < y p(x)\prec\prec p(y), by structural induction on v, where \prec \prec is the transitive closure of the sub-expression relation in Q. Since \prec \prec is well founded (Theorems 6.3.13 and 3.8.11(b)) , < must be too (Proposition 2.6.2). In particular the relation < must have no cycles, which, by substitution, would lead to the situation x = u(x).

Since G is finite and < is decidable, to verify well-foundedness it is enough to make the occurs check, for such loops (Exercise  2.36).

By < -recursion, dependent variables may now be eliminated from R using Lemma 6.5.5. Then we have one equation for each generator, which expresses it as a term in F I, the independent variables standing for themselves, so the left-hand triangle commutes:

omitted diagram environment

With a little diagram-chasing we see that FI is the required coequaliser.

Each step of the execution has replaced one unification problem by another which is equivalent to it, the last being an assignment of terms to independent variables, which has a coequaliser. Since the latter is unique up to unique isomorphism, the algorithm is confluent, despite being slightly non- deterministic. []