G. P. Huet, "A Unification Algorithm for Typed Lambda-Calculus." Theoretical Computer Science 1 (1975), 27-57.

The problem of finding MGU's of first-order terms was solved independently by Guard [1964] and Robinson [1965]. The higher-order case was investigated by Guard [1964] and Gould [1966]. Gould noted that some cases require infinite sets of unifiers. Incomplete algorithms were given by Darlington [1971, 1973] for second-order logic and Ernst [1971] for omega-order logic. Pietrzykowski [1971] gave a complete enumeration procedure from second-order logic which Jensen and Pietrzykowski [1973] extended to omega-order logic. In this algorithm, unifiers are computed by five rules: elimination, iteration, projection, imitation and identification.

Huet [1973] and Lucchesi [1972] showed third-order (hence omega-order) unification is undecidable. The algorithm in the present paper searches for unifiers, but not most general unifiers, and uses only the two rules: imitation and projection. There is no redundancy in the search. This is in contrast to the fact [Huet 1973] that any search for most general unifiers must be redundant.

This unification algorithm can be used with Huet's "constrained" resolution system [1972, 1973] for omega-order logic.

The language is the typed lambda calculus portion of Church's simple type theory, with beta-conversion but not eta-conversion. (The paper later does briefly consider the simplifications in the algorithm when eta-conversion is assumed.) Heads can be rigid or flexible.

Given two terms of the same type, we construct a matching tree for the two. First, we pair them and make the singleton set of the pair. This is the initial disagreement set. In general a disagreement set is a set of pairs of terms of the same type. A disagreement set is reduced if all the heads of the first of the pairs are flexible and at least one of the heads of the second of the pairs is rigid. A matching tree is a tree where the nodes are either terminal (success or failure) or nonterminal (with a reduced disagreement set) with the arcs connecting the nodes labeled with substitution pairs. The matching tree is constructed by sending the disagreement set to SIMPL which returns a node (either Success, Failure, or a reduced disagreement set). If the node is nonterminal, we choose a pair from the disagreement set and MATCH constructs a finite set of substitution pairs and grows an arc for each one. Performing the substitutions gives new disagreement sets which are sent to SIMPL to construct the new nodes at the ends of the arcs.

The procedure SIMPL decomposes rigid-rigid pairs (failing if the heads are different or the initial abstractors have different arities), reverses rigid-flexible pairs to be flexible-rigid, and succeeds if all pairs are flexible-flexible [so, the success nodes of a matching tree gives preunifiers]. SIMPL is proven correct, in the sense that if it fails, then there are no unifiers, if it succeeds, then there is a unifier [all flex-flex pair sets are unifiable], and if it returns a new disagreement set, then the unifiers of the new set coincide with the unifiers of the old set.

The procedure MATCH constructs projection and imitation substitutions in the usual way. MATCH is proven correct, in the sense that if rho is a unifier for e1 (flexible) and e2 (rigid), then MATCH returns a set of substitutions, one of which is more general than rho. The proof uses a complexity measure on substitutions. The matching tree may grow forever (one example is shown in which there are two disagreement pairs, one of which is flex-flex and the other is flex-rigid, but each substitution pair switches which is which).

The [pre]unification process is proven correct. First, soundness is proven, i.e., if the matching tree contains a success node, then there is a unifier. In fact, such a unifier is given by composing the substitutions down the path to the success node with a unifier for the flex-flex set at the success node. Next, completeness is proven, i.e., if there is a unifier, then every matching tree ["every" in the sense that the choice of which pair to send to MATCH is nondeterministic] contains a success node at a finite level. (He notes that the algorithm given above may construct an infinite tree if there are no unifiers, even in the first-order case.)

A complete set of unifiers (CSU) is a set of unifiers so that every other unifier is more specific than at least one of them [contrast this to a complete set of most general unifiers]. The collection of substitutions given by all success nodes enumerated by a complete matching tree (where we ensure all new variables are new for the path, not just for the terms) does not, in general, give a CSU because we do not solve flex-flex pairs. These substitutions give "initial segments" of unifiers. To enumerate a CSU for a set of flex-flex pairs we would need to use rules similar to "Pietrzykowski's elimination, iteration, and identification."

Huet [1973] proved that any search enumerating a CSU must be redundant, in the sense that on some cases it must produce two unifiers one of which is more general than the other. The search for the existence of a unifier [preunification] need not be redundant. In fact, we can slightly modify the MATCH algorithm to only produce substitution pairs of "maximum" arity (as used in the terms) (e.g., f -> lambda u . A(h(u)) instead of f -> A).

With this modified algorithm, we have the notion of a "reduced complete matching tree" which yields incompatible substitutions down each different path.

The moral is that all the redundancy in the search for a CSU can be pushed into the flexible-flexible cases.

If we assume eta-conversion, then we consider the eta-normal form of a term as the "long" form. The MATCH algorithm is simplified since certain cases collapse. The result is that MATCH can generate at most one imitation substitution pair and at most p1 projection substitution pairs (as compared to without eta in which there could be min(p1,p2)+1 imitation cases and p1(p1+2n+1)/2 projection cases). [Here p1 and p2 are the length of the number of arguments of each term and n is the length of the abstractor sequences.] Clearly, the branching factor of the matching tree is reduced. The resulting algorithm is sound, complete, and non-redundant (in the sense that different paths produce incompatible substitutions).

The algorithm can be improved using some heuristics. First, we can choose how to select the pair to be given to MATCH. Some pairs will lead quickly to a failure while others may lead to an infinite path. A first-in first-out approach avoids delaying a failure forever. We could build in checks to recognize a cycle on a path, but we cannot expect to recognize every infinite branch since higher order unification is undecidable. Finally, there is the fixed point problem of unifying x with a term e which may contain free occurrences of x. Of course, in first-order this case is ruled out by occurs-check. In higher order solutions to the problem correspond to fixed points of the function "lambda x e." As in the first-order case, if x is not free in e, then the most general unifier is <x,e>. The notion of a rigid path for x is defined and we have that if there is no rigid path for x in e, then e and x are unifiable. The converse fails. That is, there may be a rigid path for x in e where x and e are unifiable. We do have two special cases, however. If e has an empty binder and there is a rigid path for x in e, then x and e are not unifiable. If some occurrence of x on a rigid path has no arguments, then x and e are not unifiable (this includes the first-order case).

One of the important properties of the unification algorithm given here is that it checks for unifiability while ignoring the "don't care" flex-flex cases (which are the most difficult to solve). This suggests we should use the algorithm with theorem proving techniques that are based on unifiability (the existence of unifiers) instead of most general unifiers.