
A (machine-oriented) logic based on pattern matching

Published online by Cambridge University Press:  05 July 2023

Tim Lethen*
Affiliation:
Department of Philosophy, University of Helsinki, Helsinki, Finland

Abstract

Robinson’s unification algorithm can be identified as the underlying machinery of both C. Meredith’s rule D (condensed detachment) in propositional logic and of the construction of principal types in lambda calculus and combinatory logic. In combinatory logic, it also plays a crucial role in the construction of Meyer, Bunder & Powers’ Fool’s model. This paper now considers pattern matching, the unidirectional variant of unification, as a basis for logical inference, typing, and a very simple and natural model for untyped combinatory logic. An analysis of the new typing scheme will enable us to characterize a large class of terms of combinatory logic which do not change their principal type when being weakly reduced. We also consider the question whether the major or the minor premise should be used as the fixed pattern.

Type
Paper
Copyright
© The Author(s), 2023. Published by Cambridge University Press

1. Introduction

In propositional logic, rule $\mathbf{D}$ , or condensed detachment as it is often called, is a rule of inference invented by Carew Arthur Meredith in 1957 (cf. Lemmon et al. 1957, Section 9) as a means to study propositional calculi. As we shall see in Section 2, it combines a “minimal” amount of substitution with the rule of detachment or modus ponens. While condensed detachment, which has been called “Meredith’s most widely used innovation in logic” (Meredith 1977), is often regarded as an abbreviative device in Hilbert-style proofs, it has also widely been used in the field of automated theorem proving, Footnote 1 and it is known that David Meredith, as early as 1957, “programmed UNIVAC I to find the D-derivatives of any given set of formulae, but with only one thousand 12-character words of memory … didn’t get any results that could not have been obtained by hand more quickly.” (Kalman 1983).

A few years later, in 1965, J. A. Robinson’s pioneering and influential paper introducing the unification algorithm appeared (Robinson 1965), and it was soon realized that it was this very algorithm which lay at the heart of Meredith’s rule $\mathbf{D}$ . Footnote 2 Unification was a central part of Robinson’s resolution principle, which can be regarded as a “machine-oriented” rule of inference for first-order logic. Nevertheless, as Robinson (1979, p. 292) states, the idea of unification “turns out to have been sitting there all these years, unnoticed, in Herbrand’s doctoral thesis” (cf. Herbrand 1930, Chapter 5, Section 2.4). Further details about the history of condensed detachment and its connection to Robinson’s unification algorithm can be found in Meredith (1977), Kalman (1983), and Hindley (1997, p. 103).

Another strand of research closely connected to unification is the theory of simple types. Here, the principal type of a term in lambda calculus or combinatory logic can be regarded as the term’s most general type, and it was soon realized that the algorithm designed to compute the principal type of a term is not only based on Robinson’s unification algorithm but also mirrors Meredith’s rule $\mathbf{D}$ of the propositional calculus in a perfect manner. An overview of the history of this apparently independent field of development can be found in Hindley (1997, p. 33).

Finally, a type-lifted variant of rule $\mathbf{D}$ – and thus the unification algorithm – also lies at the heart of a model of untyped combinatory logic published in Meyer et al. (1991). It is called the “Fool’s Model.”

In this paper, we now shift the emphasis to an algorithm called pattern matching, which can be regarded as a simplified, “unidirectional variant of unification” (Knight 1989), and which is very popular in a wide range of applications. As we shall see, this variant gives rise to a new rule of inference in propositional logic, to a simplified typing scheme in lambda calculus and combinatory logic, and, finally, to a very natural model of combinatory logic.

We proceed as follows. In Section 2, we summarize the basics of condensed detachment, principal types, and the unification algorithm. In Section 3, we introduce a version of pattern matching upon which we will base the new rule of inference and the new typing scheme. As we shall see in Section 4, this typing scheme will lead to the identification of a large class of terms of combinatory logic which do not change their principal type when weakly reduced. In Section 5, we introduce a simple new model for untyped combinatory logic, which we call the pattern model. Finally, Section 6 will consider the question why the logical inference rule and the typing scheme based on pattern matching take the minor premise (instead of the major premise) as their fixed pattern.

2. Condensed Detachment, Principal Types, and Unification

In this section, we summarize some basic definitions and facts about condensed detachment, principal types, and the unification algorithm, which will help to introduce the new rule of inference and the new typing scheme in the section to follow.

2.1 Propositional logic and condensed detachment

In order to facilitate the comparison between condensed detachment and principal types, the considered propositional formulas will be purely implicational. To be more precise, propositional formulas are defined by the simple grammar rule:

(1) \begin{align*} \varphi ::= p\, |\, (\varphi \rightarrow \varphi) \end{align*}

where p ranges over a countably infinite set of propositional variables.

We call the formula $\alpha'$ an alphabetic variant of the formula $\alpha$ if, in order to obtain $\alpha'$ , all or some variables in $\alpha$ have been substituted by variables distinct from each other and distinct from any variables in $\alpha$ which have not been replaced. Footnote 3 A substitution $\sigma$ is called a unifier of the formulas $\alpha$ and $\beta$ if $\sigma(\alpha) = \sigma(\beta)$ . A unifier $\sigma$ of $\alpha$ and $\beta$ is called most general if $\sigma'(\alpha)$ is a substitution instance of $\sigma(\alpha)$ for every other unifier $\sigma'$ of $\alpha$ and $\beta$ . In this case, we call $\sigma(\alpha)$ the most general unification of $\alpha$ and $\beta$ .
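To make these notions concrete, the following minimal Python sketch (our own illustrative encoding, not part of the original development) represents implicational formulas as nested tuples and substitutions as dictionaries; the names imp, apply_subst, and is_unifier are ours. The substitution checked at the end is the most general unifier that reappears in the worked example of rule $\mathbf{D}$ below.

```python
# Illustrative encoding (not from the paper): a propositional variable is a
# string, an implication is the tuple ('->', antecedent, consequent).
def imp(a, b):
    return ('->', a, b)

def is_var(formula):
    return isinstance(formula, str)

def apply_subst(sigma, formula):
    """Apply the (simultaneous) substitution sigma, a dict variable -> formula."""
    if is_var(formula):
        return sigma.get(formula, formula)
    return ('->', apply_subst(sigma, formula[1]), apply_subst(sigma, formula[2]))

def is_unifier(sigma, a, b):
    """sigma is a unifier of a and b iff it makes them syntactically equal."""
    return apply_subst(sigma, a) == apply_subst(sigma, b)

# The most general unifier from the rule-D example below:
sigma = {'a': imp('b', 'c'), 'p': imp('b', 'c')}
alpha = imp('a', imp('b', 'c'))     # a -> (b -> c)
gamma = imp('p', 'p')               # p -> p
assert is_unifier(sigma, alpha, gamma)
print(apply_subst(sigma, alpha))    # (b -> c) -> (b -> c), the most general unification
```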

The rule $\mathbf{D}$ (condensed detachment) can then be given as:

(D) \begin{align*} \frac{\alpha \rightarrow \beta \qquad \gamma}{\sigma(\beta)} \end{align*}

where $\gamma'$ is an alphabetic variant of $\gamma$ which has no common variables with $\alpha$ , and where $\sigma$ is a most general unifier of $\alpha$ and $\gamma'$ for which no new variable in $\sigma(\alpha)$ occurs in $\beta$ . Footnote 4 If rule $\mathbf{D}$ can be applied to $\alpha \rightarrow \beta$ and $\gamma$ , we write ${\mathbf{D}(\alpha \rightarrow \beta, \gamma) = \sigma(\beta)}$ . Footnote 5

As an example of an application of rule $\mathbf{D}$ , let the major premise ${\alpha \rightarrow \beta}$ be ${(a \rightarrow (b \rightarrow c)) \rightarrow ((a \rightarrow b) \rightarrow (a \rightarrow c))}$ , and let the minor premise $\gamma$ be the formula ${p \rightarrow p}$ . As $\alpha$ and $\gamma$ share no variables, we can immediately proceed with their unification, which is realized by the most general unifier ${\sigma = \left\lbrace a\leadsto(b \rightarrow c), p\leadsto(b \rightarrow c) \right\rbrace}$ . Thus, the resulting formula $\sigma(\beta)$ is ${((b \rightarrow c) \rightarrow b) \rightarrow ((b \rightarrow c) \rightarrow c)}$ .

The representation of the just mentioned formulas as binary trees leads to the picture in Fig. 1, in which the original trees $\alpha$ and $\gamma$ successfully try to grow into the same overall shape or structure. Note how the subtree $\beta$ follows the same “growing rules” as $\alpha$ and $\gamma$ do. Finally, variables in inner nodes will have to be replaced by the implication operator.

Figure 1. Rule $\mathbf{D}$ applied to the major premise $\alpha \rightarrow \beta = (a \rightarrow (b \rightarrow c)) \rightarrow ((a \rightarrow b) \rightarrow (a \rightarrow c))$ and the minor premise $\gamma = {p \rightarrow p}$ , both represented as binary trees. The resulting formula is $\sigma(\beta) = ((b \rightarrow c) \rightarrow b) \rightarrow ((b \rightarrow c) \rightarrow c)$ .

2.2 Combinatory logic and principal types

Terms of combinatory logic (CL-terms) are built according to the following production rule:

(2) \begin{align*} \varphi ::= \mathsf{S}\, |\, \mathsf{K}\, |\, (\varphi\, \varphi), \end{align*}

the third option being called an application. Example CL-terms are $\mathsf{S}$ , $(\mathsf{K}\mathsf{S})$ , and $((\mathsf{K}\mathsf{K})(\mathsf{K}\mathsf{S}))$ , the last one of which is usually simply written as $\mathsf{K}\mathsf{K}(\mathsf{K}\mathsf{S})$ following an association to the left.

Two rules of weak reduction Footnote 6 can be applied to CL-terms, and while the actual terms can be interpreted as algorithms, it is the reduction rules which bring in the dynamics of the computation. These rules are

(3) \begin{align*} \mathsf{S} xyz \rhd xz(yz) \end{align*}
(4) \begin{align*} \mathsf{K} xy \rhd x \end{align*}

where the metavariables x, y, and z represent arbitrary CL-terms. Applying these rules, an example reduction leading to what is called an irreducible normal form might look as follows:

\begin{align*} \mathsf{K}(\mathsf{S}\mathsf{K}\mathsf{S}\mathsf{S})\mathsf{K} \rhd \mathsf{K}(\mathsf{K}\mathsf{S}(\mathsf{S}\mathsf{S}))\mathsf{K} \rhd \mathsf{K}\mathsf{S}\mathsf{K} \rhd \mathsf{S} \end{align*}

The reader should note that we could also have chosen a different reduction path, which, nevertheless, would have led to the same normal form, a property of combinatory logic known as the Church–Rosser property:

\begin{align*} \mathsf{K}(\mathsf{S}\mathsf{K}\mathsf{S}\mathsf{S})\mathsf{K} \rhd \mathsf{S}\mathsf{K}\mathsf{S}\mathsf{S} \rhd \mathsf{K}\mathsf{S}(\mathsf{S}\mathsf{S}) \rhd \mathsf{S} \end{align*}
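The reduction rules can be animated with a small, self-contained Python sketch (our own encoding and function names, not the paper’s): CL-terms are nested pairs, and a single leftmost-outermost step is applied until no redex remains. Running it on K(SKSS)K reproduces the second of the two reduction paths shown above and ends in the normal form S.

```python
# Our own encoding: 'S' and 'K' are strings, an application (M N) is the pair (M, N).
def contract(t):
    """Contract t if it is itself a K- or S-redex; otherwise return None."""
    if isinstance(t, tuple) and isinstance(t[0], tuple) and t[0][0] == 'K':
        return t[0][1]                          # K x y    |>  x
    if (isinstance(t, tuple) and isinstance(t[0], tuple)
            and isinstance(t[0][0], tuple) and t[0][0][0] == 'S'):
        x, y, z = t[0][0][1], t[0][1], t[1]
        return ((x, z), (y, z))                 # S x y z  |>  x z (y z)
    return None

def step(t):
    """Perform one leftmost-outermost weak reduction step, or return None."""
    reduced = contract(t)
    if reduced is not None:
        return reduced
    if isinstance(t, tuple):
        left = step(t[0])
        if left is not None:
            return (left, t[1])
        right = step(t[1])
        if right is not None:
            return (t[0], right)
    return None

# K (S K S S) K, reduced step by step; the last term printed is the normal form 'S'.
term = (('K', ((('S', 'K'), 'S'), 'S')), 'K')
while term is not None:
    print(term)
    term = step(term)
```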

In typed combinatory logic, principal types can be assigned to certain CL-terms. These types follow the building law given by production rule (1), that is, every type can be interpreted as an implicational proposition. Two axioms assign the principal types ${(a \rightarrow (b \rightarrow c)) \rightarrow ((a \rightarrow b) \rightarrow (a \rightarrow c))}$ and ${a \rightarrow (b \rightarrow a)}$ (or any other alphabetic variant) to the CL-terms $\mathsf{S}$ and $\mathsf{K}$ , respectively. If two CL-terms M and N with principal types ${\varphi \rightarrow \psi}$ and $\vartheta$ are given, the principal type ${\mathbf{D}(\varphi \rightarrow \psi, \vartheta)}$ , if it exists, is assigned to the application $(M\, N)$ . Otherwise, $(M\, N)$ is regarded as not typable. The typing scheme is summarized in Fig. 2.

Figure 2. Assigning principal types to terms in CL.

As an example, the tree in Fig. 3 assigns the principal type ${(a \rightarrow a)}$ to the CL-term $\mathsf{S}\mathsf{K}\mathsf{K}$ .

Figure 3. A tree assigning the principal type ${(a \rightarrow a)}$ to the CL-term $\mathsf{S}\mathsf{K}\mathsf{K}$ , using two applications of rule (pt). The substitution in the first step is $\left\lbrace a \!\leadsto\! p, b\!\leadsto\! q, c\!\leadsto\! p\right\rbrace$ and in the second step $\left\lbrace p \!\leadsto\! a, q\!\leadsto\! (b \rightarrow a)\right\rbrace $ .

2.3 The unification algorithm

Without going into too many details, we now present a version of the unification algorithm which is based on a variant given in Russell and Norvig (2010) and which, in our case, computes a most general unifier of two implicational propositional formulas $\alpha$ and $\beta$ . Basically, Algorithm 1 considers four cases: if $\alpha$ and $\beta$ are equal, the substitution built up so far is returned. If one of the formulas is a variable, the two formulas are passed on to Algorithm 2, unify-var, which unifies the variable with the second formula. Otherwise, the left and right parts of $\alpha$ and $\beta$ , that is, their antecedents and consequents, are recursively unified.

Two things are worth noting about this algorithm. First, unification is “nearly” symmetrical in $\alpha$ and $\beta$ : the only difference breaking the symmetry is that the test whether $\alpha$ is a variable comes before the test whether $\beta$ is a variable. Thus, in certain cases, unify $(\alpha, \beta)$ may lead to a result different from the one returned by unify $(\beta, \alpha)$ . Footnote 7 However, it can be shown that the resulting unified formulas are always alphabetic variants of each other. Second, the algorithm is far from trivial: as Peter Norvig notes (Norvig 1991), many textbooks in the 1980s presented Footnote 8 versions of the algorithm which, for example, failed to unify the formulas ${(a \rightarrow (b \rightarrow a)) \rightarrow (b \rightarrow (a\rightarrow b))}$ and ${p \rightarrow p}$ due to a missing dereferencing clause when unifying variables (lines 3/4 in Algorithm 2).

Algorithm 1 unify(α, β, σ)

Algorithm 2 unify-var(var, β, σ)
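Since Algorithms 1 and 2 are reproduced here only as figures, the following Python sketch (our own reconstruction in the spirit of the Russell and Norvig variant, with our own function names unify, unify_var, occurs, and resolve) may help to fix ideas. The first two branches of unify_var play the role of the dereferencing clause mentioned above; the final lines run the sketch on the pair of formulas that, according to the text, defeated several textbook versions.

```python
# Our reconstruction of the unification algorithm described above (names ours).
# A variable is a string, an implication is ('->', antecedent, consequent),
# and a substitution is a dict mapping variable names to formulas.
def is_var(f):
    return isinstance(f, str)

def unify(a, b, sigma):
    """Return a most general unifier extending sigma, or None on failure."""
    if sigma is None:
        return None
    if a == b:
        return sigma
    if is_var(a):
        return unify_var(a, b, sigma)
    if is_var(b):
        return unify_var(b, a, sigma)
    # both are implications: unify the antecedents, then the consequents
    return unify(a[2], b[2], unify(a[1], b[1], sigma))

def unify_var(var, b, sigma):
    # the first two branches dereference existing bindings before binding var
    if var in sigma:
        return unify(sigma[var], b, sigma)
    if is_var(b) and b in sigma:
        return unify(var, sigma[b], sigma)
    if occurs(var, b, sigma):
        return None
    return {**sigma, var: b}

def occurs(var, f, sigma):
    """Occur check that follows the bindings already recorded in sigma."""
    if is_var(f):
        return f == var or (f in sigma and occurs(var, sigma[f], sigma))
    return occurs(var, f[1], sigma) or occurs(var, f[2], sigma)

def resolve(f, sigma):
    """Apply sigma exhaustively, yielding the unified formula."""
    while is_var(f) and f in sigma:
        f = sigma[f]
    if is_var(f):
        return f
    return ('->', resolve(f[1], sigma), resolve(f[2], sigma))

# The formulas mentioned above, which require the dereferencing clause:
lhs = ('->', ('->', 'a', ('->', 'b', 'a')), ('->', 'b', ('->', 'a', 'b')))
rhs = ('->', 'p', 'p')
sigma = unify(lhs, rhs, {})
print(resolve(lhs, sigma))                         # the most general unification
print(resolve(lhs, sigma) == resolve(rhs, sigma))  # True
```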

3. Pattern Matching, Pattern Detachment, and Pattern Typing

As mentioned in the Introduction, we now shift the focus to the unidirectional sibling of the unification algorithm, which is known as pattern matching. Pattern matching regards one of the given formulas as a fixed and unchangeable pattern to which the other formula has to be matched using appropriate variable substitutions. Footnote 9 This asymmetry is reflected in Algorithm 3 (match) by the difference between the clause in lines 5/6 on the one hand and the clause in lines 7/8 on the other hand: if $\beta$ is a variable but $\alpha$ is not, the matching is doomed to fail because $\beta$ is not allowed to grow. Furthermore, Algorithm 4 (match-var) fails if a variable var has to be substituted by an expression $\beta$ , but the substitution built so far already contains a binding ${\left\langle var \!\leadsto\! \gamma\right\rangle}$ for which $\beta \neq \gamma$ holds. In this case, the match fails, as neither $\beta$ nor $\gamma$ may be instantiated any further.

Algorithm 3 match(α, β, σ)

Algorithm 4 match-var(var, β, σ)
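Again, as the listings of Algorithms 3 and 4 appear only as figures, here is a sketch of pattern matching as described above (our own reconstruction and function names; the second argument plays the role of the fixed pattern). The final calls illustrate the asymmetry: a variable on the left may be bound to an arbitrary part of the pattern, whereas the pattern itself is never allowed to grow, and a conflicting earlier binding makes the match fail.

```python
# Our reconstruction of the matching algorithm (names ours); beta is the fixed
# pattern, and only the variables of alpha may be instantiated.
def is_var(f):
    return isinstance(f, str)

def match(alpha, beta, sigma):
    """Extend sigma so that sigma(alpha) equals the fixed pattern beta."""
    if sigma is None:
        return None
    if alpha == beta:
        return sigma
    if is_var(alpha):
        return match_var(alpha, beta, sigma)
    if is_var(beta):
        return None          # beta is a variable but alpha is not: the pattern may not grow
    return match(alpha[2], beta[2], match(alpha[1], beta[1], sigma))

def match_var(var, beta, sigma):
    if var in sigma:
        # an earlier binding must agree exactly with beta
        return sigma if sigma[var] == beta else None
    return {**sigma, var: beta}

print(match('a', ('->', 'p', 'q'), {}))               # {'a': ('->', 'p', 'q')}
print(match(('->', 'a', 'b'), 'p', {}))               # None: the pattern may not grow
print(match(('->', 'a', 'a'), ('->', 'p', 'q'), {}))  # None: conflicting bindings for a
```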

While pattern matching is a widely used technique in Artificial Intelligence and related fields, it does not play an apparent role when it comes to logical rules of inference in propositional logic or to type inference in combinatory logic. In what follows, we take a closer look at the consequences of replacing the unification algorithm by pattern matching as the underlying mechanism of these kinds of inferences. To start with, we define an inference rule $\mathbf{P}$ (pattern detachment) as follows:

(P) \begin{align*} \frac{\alpha \rightarrow \beta \qquad \gamma}{\sigma(\beta)} \end{align*}

Here, if it exists, $\sigma$ is the substitution for which $\sigma(\alpha) = \gamma'$ holds, where $\gamma'$ is an alphabetic variant of $\gamma$ which does not share any variables with $\beta$ . Footnote 10 In this case, we write ${\mathbf{P}(\alpha \rightarrow \beta, \gamma) = \sigma(\beta)}$ , otherwise ${\mathbf{P}(\alpha \rightarrow \beta, \gamma)}$ remains undefined. The reader should note that if ${\mathbf{P}(\alpha \rightarrow \beta, \gamma)}$ is defined, then ${\mathbf{D}(\alpha \rightarrow \beta, \gamma)}$ (condensed detachment) is also defined and ${\mathbf{P}(\alpha \rightarrow \beta, \gamma)} = {\mathbf{D}(\alpha \rightarrow \beta, \gamma)}$ holds.

An example of pattern detachment is shown in Fig. 4, where the formulas ${(a \rightarrow (b \rightarrow c)) \rightarrow (b \rightarrow (a \rightarrow c))}$ and ${(p \rightarrow q) \rightarrow (p \rightarrow q)}$ , both represented as binary trees, lead to the result ${p \rightarrow ((p \rightarrow q) \rightarrow q)}$ , the matching substitution being ${\sigma = \left\lbrace a \leadsto (p \rightarrow q), b \leadsto p, c \leadsto q \right\rbrace }$ .

Figure 4. Rule $\mathbf{P}$ applied to the major premise $\alpha \rightarrow \beta = (a \rightarrow (b \rightarrow c)) \rightarrow (b \rightarrow (a \rightarrow c))$ and the minor premise $\gamma = {(p \rightarrow q) \rightarrow (p \rightarrow q)}$ , both represented as binary trees. The resulting formula is $\sigma(\beta) = p \rightarrow ((p \rightarrow q) \rightarrow q)$ . Note that the minor premise $\gamma$ remains in its fixed shape.

When it comes to typed combinatory logic, the notion of pattern typing can be defined exactly as in Fig. 2, the third rule (pt) being replaced by the following rule (ptt): if two CL-terms M and N have pattern types ${\varphi \rightarrow \psi}$ and $\vartheta$ , then the pattern type ${\mathbf{P}(\varphi \rightarrow \psi, \vartheta)}$ , if it exists, is assigned to the application $(M\, N)$ . If ${\mathbf{P}(\varphi \rightarrow \psi, \vartheta)}$ is not defined, then the CL-term $(M\, N)$ is not pattern typable. If a CL-term Q is pattern typable, we use the notation ptt(Q) for its pattern type.

As an example of pattern typing, we refer the reader back to Fig. 3, where we simply replace the annotation (pt) by (ptt), stressing the fact that in this example both applications of rule (ptt) regard the type of the combinator $\mathsf{K}$ (the minor premise) as a fixed pattern, to which the antecedents of the types of the combinators $\mathsf{S}$ and $\mathsf{S}\mathsf{K}$ have to be matched. The example also illustrates the fact that every pattern type is a principal type of its CL-term.
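The following self-contained Python sketch (our own code; it repeats a compact version of the matching core from the previous sketch so that it runs on its own) implements rule $\mathbf{P}$ as pattern_detachment and pattern typing as ptt. It reproduces the example of Fig. 4 and assigns the pattern type a → a to S K K, as in Fig. 3 read with (ptt); all function names and the fresh-variable scheme are ours.

```python
import itertools

# A variable is a string, an implication is ('->', antecedent, consequent).
def is_var(f):
    return isinstance(f, str)

def match(alpha, beta, sigma):
    """Extend sigma so that sigma(alpha) equals the fixed pattern beta."""
    if sigma is None:
        return None
    if alpha == beta:
        return sigma
    if is_var(alpha):
        if alpha in sigma:
            return sigma if sigma[alpha] == beta else None
        return {**sigma, alpha: beta}
    if is_var(beta):
        return None                      # the fixed pattern may not grow
    return match(alpha[2], beta[2], match(alpha[1], beta[1], sigma))

def apply_subst(sigma, f):
    if is_var(f):
        return sigma.get(f, f)
    return ('->', apply_subst(sigma, f[1]), apply_subst(sigma, f[2]))

def rename(f, mapping, fresh):
    """An alphabetic variant of f with globally fresh variable names."""
    if is_var(f):
        if f not in mapping:
            mapping[f] = next(fresh)
        return mapping[f]
    return ('->', rename(f[1], mapping, fresh), rename(f[2], mapping, fresh))

def pattern_detachment(major, minor, fresh):
    """Rule P: match the antecedent of the major premise onto a renamed copy
    of the minor premise (the fixed pattern) and return sigma(consequent)."""
    if is_var(major):
        return None
    sigma = match(major[1], rename(minor, {}, fresh), {})
    return None if sigma is None else apply_subst(sigma, major[2])

# Pattern typing: S and K receive fresh copies of their axiom types; an
# application (M N) receives P(ptt(M), ptt(N)) whenever that is defined.
S_TYPE = ('->', ('->', 'a', ('->', 'b', 'c')),
                ('->', ('->', 'a', 'b'), ('->', 'a', 'c')))
K_TYPE = ('->', 'a', ('->', 'b', 'a'))

def ptt(term, fresh):
    if term == 'S':
        return rename(S_TYPE, {}, fresh)
    if term == 'K':
        return rename(K_TYPE, {}, fresh)
    f, x = ptt(term[0], fresh), ptt(term[1], fresh)
    if f is None or x is None:
        return None
    return pattern_detachment(f, x, fresh)

fresh = (f'v{i}' for i in itertools.count())

# The example of Fig. 4: the C-type as major premise, (p -> q) -> (p -> q) as minor.
major = ('->', ('->', 'a', ('->', 'b', 'c')), ('->', 'b', ('->', 'a', 'c')))
minor = ('->', ('->', 'p', 'q'), ('->', 'p', 'q'))
print(pattern_detachment(major, minor, fresh))   # p -> ((p -> q) -> q), up to renaming

# Fig. 3 read with (ptt): the pattern type of S K K is a -> a, up to renaming.
print(ptt((('S', 'K'), 'K'), fresh))
```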

4. Pattern Types and Weak Reduction

It is a widely known fact that weak reduction is capable of changing the principal type of a CL-term. To be more precise, if $\varphi$ is the principal type of a CL-term M which weakly reduces to N (i.e., $M \rhd N$ ), then N has a principal type $\psi$ of which $\varphi$ is a substitution instance. Footnote 11 As an example, consider the CL-term $\mathsf{S}\mathsf{K}\mathsf{S}\mathsf{I}$ (where $\mathsf{I}$ abbreviates the term $\mathsf{S}\mathsf{K}\mathsf{K}$ ). This term reduces to $\mathsf{K}\mathsf{I}(\mathsf{S}\mathsf{I})$ . But while the former term has the principal type ${(a \rightarrow b) \rightarrow (a \rightarrow b)}$ , the latter has the principal type ${a \rightarrow a}$ . If we consider CL-terms as programs and a reduction step as part of a computation, then the possible change of the program’s principal type during a computation may well disturb our picture of a type as a way to characterize the input and output of the computation. Against this background, it is remarkable that the pattern type of a CL-term never changes when the term is reduced.
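Before turning to this result, the principal types mentioned in the example above can be checked mechanically. The following self-contained sketch (our own code and names, repeating a compact unification core in the style of Section 2.3) implements rule $\mathbf{D}$ as condensed_detachment and principal typing on top of it; the printed types are alphabetic variants of (a → b) → (a → b) and a → a, respectively.

```python
import itertools

# Our own compact sketch: unification, rule D, and principal typing for S and K.
def is_var(f):
    return isinstance(f, str)

def unify(a, b, s):
    if s is None: return None
    if a == b: return s
    if is_var(a): return unify_var(a, b, s)
    if is_var(b): return unify_var(b, a, s)
    return unify(a[2], b[2], unify(a[1], b[1], s))

def unify_var(var, b, s):
    if var in s: return unify(s[var], b, s)
    if is_var(b) and b in s: return unify(var, s[b], s)
    if occurs(var, b, s): return None
    return {**s, var: b}

def occurs(var, f, s):
    if is_var(f): return f == var or (f in s and occurs(var, s[f], s))
    return occurs(var, f[1], s) or occurs(var, f[2], s)

def resolve(f, s):
    while is_var(f) and f in s: f = s[f]
    return f if is_var(f) else ('->', resolve(f[1], s), resolve(f[2], s))

def rename(f, mapping, fresh):
    """An alphabetic variant of f with globally fresh variable names."""
    if is_var(f):
        if f not in mapping: mapping[f] = next(fresh)
        return mapping[f]
    return ('->', rename(f[1], mapping, fresh), rename(f[2], mapping, fresh))

S_TYPE = ('->', ('->', 'a', ('->', 'b', 'c')),
                ('->', ('->', 'a', 'b'), ('->', 'a', 'c')))
K_TYPE = ('->', 'a', ('->', 'b', 'a'))

def condensed_detachment(major, minor, fresh):
    """Rule D: unify the antecedent of the major with a renamed copy of the minor."""
    s = unify(major[1], rename(minor, {}, fresh), {})
    return None if s is None else resolve(major[2], s)

def principal_type(term, fresh):
    """Rule (pt) of Fig. 2: axiom types for S and K, rule D for applications."""
    if term == 'S': return rename(S_TYPE, {}, fresh)
    if term == 'K': return rename(K_TYPE, {}, fresh)
    f, x = principal_type(term[0], fresh), principal_type(term[1], fresh)
    if f is None or x is None or is_var(f): return None   # a bare variable cannot be applied
    return condensed_detachment(f, x, fresh)

fresh = (f't{i}' for i in itertools.count())
I_term = (('S', 'K'), 'K')                                  # I abbreviates S K K
print(principal_type(((('S', 'K'), 'S'), I_term), fresh))   # variant of (a -> b) -> (a -> b)
print(principal_type((('K', I_term), ('S', I_term)), fresh))  # variant of a -> a
```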

Theorem 1. Let M and N be two CL-terms with ${M \rhd N}$ . If M has pattern type $\varphi$ , then N is also pattern typable and has the same pattern type $\varphi$ .

Proof. We consider three cases. In case 1, M is the CL-term $\mathsf{S} xyz$ , x, y, and z denoting arbitrary CL-terms. If $\mathsf{S} xyz$ is pattern typable, x needs to have a pattern type of the structure ${\sigma_1(a \rightarrow (b \rightarrow c))}$ so that the type of $\mathsf{S}$ can match this fixed pattern in the first step:

Similar arguments lead us to the structure of the types of y and z, and finally to the pattern type of $\mathsf{S} xyz$ :

As $\mathsf{S} xyz$ reduces to xz(yz), we can now construct the pattern type of the reduct, taking into consideration the appropriate structures of the types of x, y, and z. As we can see, the deduction results in the same pattern type as before:

In case 2, M is the CL-term $\mathsf{K} xy$ , with x and y again denoting CL-terms. For $\mathsf{K} xy$ to be pattern typable, x and y may now have arbitrary pattern types, which we denote as $\sigma_1(a)$ and $\sigma_2(b)$ , respectively. Here, we stipulate that

  • $\sigma_1$ does not replace any variable other than a,

  • $\sigma_1(a)$ does not contain the variable b,

  • $\sigma_2$ does not replace any variable other than b,

which, for example, implies that $\sigma_1(b) = b$ , a fact which is used in the following deduction:

Due to the stipulations concerning the substitutions $\sigma_1$ and $\sigma_2$ , the resulting pattern type is equal to $\sigma_1(a)$ , which is the pattern type of x, which in turn is the reduct of $\mathsf{K} xy$ .

In case 3, M contains one of the redexes $\mathsf{S} xyz$ and $\mathsf{K} xy$ as a proper subterm, which is reduced to obtain the term N. Using structural induction, this case can easily be reduced to the first two cases.

As every pattern type is also a principal type, Theorem 1 identifies a large class of CL-terms which do not change their principal type when being reduced.

At this point, the question may arise how “natural” Theorem 1 is, or, to put it the other way round, one may wonder if the unchanged types in the above proof are just a mere coincidence. A look at other combinators presents a somewhat diffuse picture at first sight: while the combinators $\mathsf{B}$ and $\mathsf{W}$ with reduction rules $\mathsf{B} xyz \rhd x(yz)$ and $\mathsf{W} xy \rhd xyy$ , respectively, preserve pattern types, the combinator $\mathsf{C}$ with reduction rule $\mathsf{C} xyz \rhd xzy$ and pattern type ${(a \rightarrow (b \rightarrow c)) \rightarrow (b \rightarrow (a \rightarrow c))}$ reveals a somewhat different behavior. Again, we can first determine the types of the terms x, y, and z as follows:

Trying to pattern type the reduct xzy then leads to the insight that a pattern type might not exist if $\sigma_3$ is a substitution that modifies the type ${\sigma_2 \circ \sigma_1(b \rightarrow c)}$ :

As an example, we can now consider the term $\mathsf{C}\mathsf{T}\mathsf{I}\mathsf{K}$ with pattern type ${(a \rightarrow (b \rightarrow a))}$ , where $\mathsf{T}$ abbreviates the term $\mathsf{C}(\mathsf{S}\mathsf{K}(\mathsf{S}\mathsf{K}))$ with pattern type ${(a \rightarrow ((a \rightarrow b) \rightarrow b))}$ . It is now easy to see that the reduct $\mathsf{T}\mathsf{K}\mathsf{I}$ is not pattern typable. Footnote 12
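The failing step can be checked directly: under the (ptt) rules the pattern type of $\mathsf{T}\mathsf{K}$ works out to ((p → (q → p)) → b) → b (our own computation, stated here as an assumption), and its antecedent cannot be matched onto the fixed pattern r → r, the pattern type of $\mathsf{I}$. The following minimal sketch (our own code, repeating the compact matching core from Section 3 so that it runs on its own) makes the failure explicit.

```python
# Compact matching core, repeated so that this check runs on its own (names ours).
def match(alpha, beta, sigma):
    if sigma is None:
        return None
    if alpha == beta:
        return sigma
    if isinstance(alpha, str):
        if alpha in sigma:
            return sigma if sigma[alpha] == beta else None
        return {**sigma, alpha: beta}
    if isinstance(beta, str):
        return None                       # the fixed pattern may not grow
    return match(alpha[2], beta[2], match(alpha[1], beta[1], sigma))

# Pattern type of T K under the (ptt) rules (our own computation, an assumption here):
tk_type = ('->', ('->', ('->', 'p', ('->', 'q', 'p')), 'b'), 'b')
i_type = ('->', 'r', 'r')                 # pattern type of I = S K K, the fixed pattern
print(match(tk_type[1], i_type, {}))      # None: the final typing step for T K I fails
```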

The following theorem characterizes exactly those combinators which guarantee that reduction does preserve pattern types. If M is a combination of terms taken from the set $\left\lbrace x_1, \dots, x_n \right\rbrace$ , index(M) denotes the greatest x-index occurring within M. For example, $index(x_1 x_3 (x_2 x_3)) = 3$ .

Theorem 2. Let P be a combinator with reduction rule $P x_1 \dots x_n \rhd Q$ where Q is a combination of the variables $x_1, \dots, x_n$ . Then, for any CL-terms $x_1, \dots, x_n$ , Q is pattern typable with the same pattern type as $P x_1 \dots x_n$ , if, for every subterm (M N) of Q, the inequation $index(M) \leq index(N)$ holds.

Before we move on to the theorem’s proof, two examples may help to clarify the central point. First, the combinator $\mathsf{C}$ considered above, with reduction rule $\mathsf{C} x_1 x_2 x_3 \rhd x_1 x_3 x_2$ , violates the subterm condition in the term $((x_1 x_3) x_2)$ because $index{(x_1 x_3)} > index{(x_2)}$ . Second, the combinator $\mathsf{B}'$ with reduction rule $\mathsf{B}'x_1 x_2 x_3 \rhd x_2(x_1 x_3)$ does indeed guarantee the preservation of pattern types, as can immediately be read off the indices of the structure $(x_2(x_1 x_3))$ ; a small mechanical check of the condition is sketched below.
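In the following minimal Python sketch (our own encoding and function names), a reduct built from $x_1, \dots, x_n$ is written as a nested tuple of indices, so that $x_1 x_3 (x_2 x_3)$ becomes ((1, 3), (2, 3)); preserves_pattern_types checks the condition of Theorem 2 for every subterm.

```python
def index(term):
    """The greatest x-index occurring in a combination of x_1, ..., x_n."""
    if isinstance(term, int):
        return term
    return max(index(term[0]), index(term[1]))

def preserves_pattern_types(term):
    """Theorem 2's condition: index(M) <= index(N) for every subterm (M N)."""
    if isinstance(term, int):
        return True
    m, n = term
    return (index(m) <= index(n)
            and preserves_pattern_types(m)
            and preserves_pattern_types(n))

print(preserves_pattern_types(((1, 3), (2, 3))))  # True:  S x1 x2 x3 |> x1 x3 (x2 x3)
print(preserves_pattern_types(((1, 3), 2)))       # False: C x1 x2 x3 |> x1 x3 x2
print(preserves_pattern_types((2, (1, 3))))       # True:  B' x1 x2 x3 |> x2 (x1 x3)
```

We now proceed to the proof of Theorem 2.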

Proof. In the first step, pattern typing of the term $Px_1 \dots x_n$ leads to the following deduction, as can be shown by induction on n. Thus, every term $x_i$ ( ${1 \leq i \leq n}$ ) has pattern type ${\sigma_i \sigma_{i-1} \dots \sigma_1(\varphi_i)}$ for some type $\varphi_i$ : Footnote 13

In the second step, we now consider subterms (MN) of Q with $\underbrace{index(M)}_{=: k} \leq \underbrace{index(N)}_{=: l}$ . By structural induction, we can show that (MN) has pattern type ${\sigma_l \dots \sigma_1(\psi)}$ for some type $\psi$ . As the base case, we consider the subterm $(x_k x_l)$ for two variables $x_k: \sigma_k \dots \sigma_1(\varphi_k)$ and $x_l: \sigma_l \dots \sigma_1(\varphi_l)$ . As $k \leq l$ holds by the assumption of the theorem, we get $(x_k x_l): \sigma_l \dots \sigma_1(\psi)$ . For the induction step, we consider two terms M and N, which, by the induction hypothesis, have pattern types $\sigma_k \dots \sigma_1(\varphi_M)$ and $\sigma_l \dots \sigma_1(\varphi_N)$ , respectively. Again, $k \leq l$ holds, and we conclude that the term (MN) has pattern type $\sigma_l \dots \sigma_1(\psi)$ . In the case of the complete term Q, this is exactly the type of the term $Px_1 \dots x_n$ .

5. The Pattern Model of Untyped Combinatory Logic

Theorem 1 of the preceding section now gives rise to a model of untyped combinatory logic which can be regarded as a simplification of the Fool’s model (Meyer et al. 1991). To start with, we define the notion of a combinatory reduction algebra.

Definition 1. A combinatory reduction algebra is an algebraic structure ${(\mathcal{A}, \leq, \bullet, s, k)}$ where $\mathcal{A}$ (the domain) is a nonempty set, $\leq$ is a partial order on $\mathcal{A}$ , $\bullet\! : \mathcal{A} \times \mathcal{A} \rightarrow \mathcal{A}$ is a binary operation on $\mathcal{A}$ , and s and k are two distinct elements of $\mathcal{A}$ , such that the following stipulations hold for all $x,y,z \in \mathcal{A}$ :

(5) \begin{align*} s \bullet x \bullet y \bullet z \leq x \bullet z \bullet (y \bullet z) \end{align*}
(6) \begin{align*} k \bullet x \bullet y \leq x \end{align*}
(7) \begin{align*} x \leq y \;\Rightarrow\; z \bullet x \leq z \bullet y \end{align*}
(8) \begin{align*} x \leq y \;\Rightarrow\; x \bullet z \leq y \bullet z \end{align*}

Next, we interpret CL-terms (the set of which we denote by $\mathbb{T}$ ) by means of the elements of a combinatory reduction algebra.

Definition 2. A model of untyped combinatory logic is a pair $(\mathfrak{A}, \mathfrak{I})$ where ${\mathfrak{A} = (\mathcal{A}, \leq, \bullet, s, k)}$ is a combinatory reduction algebra, and ${\mathfrak{I}\!: \mathbb{T} \rightarrow \mathcal{A}}$ is a mapping for which the following equalities hold:

(9) \begin{align*} \mathfrak{I}(\mathsf{S}) = s \end{align*}
(10) \begin{align*} \mathfrak{I}(\mathsf{K}) = k \end{align*}
(11) \begin{align*} \mathfrak{I}((P\, Q)) = \mathfrak{I}(P) \bullet \mathfrak{I}(Q) \end{align*}

In order to introduce the new model, which we will call the pattern model, we first identify propositional formulas which are alphabetic variants of each other. To this end, we use the notation $\left\lbrace \varphi \right\rbrace_\alpha$ to denote the set of all formulas which are alphabetic variants of $\varphi$ . Sets of this kind actually form the domain of the pattern model.

Definition 3. The pattern model is the pair $(\mathfrak{P}, \mathfrak{I})$ , where

(12)
(13)
(14)
(15)
(16)
(17)
(18)

A few remarks should be appropriate at this point.

  • As defined in (13), the domain $\mathcal{P}$ of $\mathfrak{P}$ can simply be regarded as the set of implicational propositional formulas modulo alphabetic variation. $\mathcal{P}$ thus consists of equivalence classes of formulas which have the same “logical power.” Footnote 14

  • If x and y are elements of the domain $\mathcal{P}$ , $x \subseteq y$ only holds if $x = y$ or $x = \emptyset$ .

  • As defined in (15), $x = \emptyset$ or $y = \emptyset$ immediately leads to $x \bullet y = \emptyset$ . This corresponds to the fact that any CL-term that has a pattern-untypable subterm is itself pattern-untypable.

  • If both $x \neq \emptyset$ and $y \neq \emptyset$ , then $x \bullet y$ is the result of (possibly) applying rule $\mathbf{P}$ (pattern detachment) to two representatives of the sets x and y, finally collecting the alphabetic variants of the result. Footnote 15

  • Concerning the role of the empty set $\emptyset$ , computer experiments have been able to show that 310,210 out of 397,242 terms – with a maximum of eight applications – are indeed pattern typable and are thus not represented by the empty set.

Theorem 3. The pattern model is a model of untyped combinatory logic.

Proof. We first show that for every pattern model $(\mathfrak{P},\mathfrak{I})$ , $\mathfrak{P}$ is a combinatory reduction algebra. To this end, we show that stipulations (5) and (7) hold. Stipulations (6) and (8) follow in exactly the same manner.

For stipulation (5), we consider two cases. In the first case, the expression $s \bullet x \bullet y \bullet z$ is equal to the empty set. In this case, set inclusion is trivially fulfilled. Otherwise, one can successfully apply rule $\mathbf{P}$ three times to representatives of the sets s, x, y, and z, yielding a nonempty set of alphabetic variants of a formula $\vartheta$ . But now Theorem 1 guarantees that the expression $x \bullet z \bullet (y \bullet z)$ leads to exactly the same result. Again, set inclusion holds.

For stipulation (7), we also consider two cases. In the first case, $x \subseteq y$ holds because x is the empty set. But in this case, the expression $z \bullet x$ is also equal to the empty set, and thus $z \bullet x \subseteq z \bullet y$ is fulfilled. In the second case, $x \subseteq y$ holds because x and y are equal. But then $z \bullet x$ and $z \bullet y$ are also equal.

In the next step, we show that the interpretation $\mathfrak{I}$ does indeed give rise to a model. As equalities (9) and (10) obviously hold, we can concentrate on equality (11). This time, we consider three cases. In case 1, at least one of the CL-terms P and Q is not pattern typable. This implies that the application (PQ) is not pattern typable, either. Thus, both sides of the equation evaluate to the empty set. In case 2, both P and Q are pattern typable, but the application (PQ) is not. Again, both sides of the equation yield the empty set, the right-hand side because the sets $\mathfrak{I}(P)$ and $\mathfrak{I}(Q)$ do not contain any formula to which rule $\mathbf{P}$ can be applied. In the third case, P, Q, and (PQ) are pattern typable. Thus, the term $\mathfrak{I}(P) \bullet \mathfrak{I}(Q)$ evaluates exactly to the set of alphabetic variants of the pattern type of the application (PQ).

We conclude this section with a short comparison between the pattern model and the Fool’s model. Details about the latter can be found in Meyer et al. (1991) and Bimbó (2011, Chapter 6). The Fool’s model can actually be obtained by simply changing Definition 3 as follows:

  1. Change the index $\alpha$ to an index s all the way through, where the notation $\left\lbrace \varphi \right\rbrace_s $ represents the set of all substitution instances of $\varphi$ .

  2. In (18), change “ptt” and “pattern typable” into “pt” and “principal-typable,” respectively.

The advantages of the pattern model over the Fool’s model are now evident: First, the former is based on the very natural domain consisting of propositional formulas (modulo the renaming of variables). Second, two arbitrary CL-terms P and Q with $P \rhd Q$ are always represented by the same object, thus supporting the idea that a computation is not supposed to change its input and output types while running.

6. Major versus Minor Premises as Fixed Patterns

In this last section, we answer the question why we have decided to consider the minor premise as the fixed and unchangeable pattern both in rule $\mathbf{P}$ and in rule (ptt). The main reason is that any proof which also allows for the use of the major premise as the fixed pattern can – under certain very “mild” conditions – be transformed into an equivalent proof which uses rule $\mathbf{P}$ only. To be more precise, we first define the following rule $\mathbf{P}'$ , which is applicable if and only if there is a substitution $\sigma$ for which $\sigma(\gamma) = \alpha$ holds:

(P′) \begin{align*} \frac{\alpha \rightarrow \beta \qquad \gamma}{\beta} \end{align*}

Using the notation $\mathbb{A} \vdash_{\mathbf{P}} \varphi$ ( $\mathbb{A} \vdash_{\mathbf{P},\mathbf{P}'} \varphi$ ) for the existence of a proof for $\varphi$ which is based on the axioms $\mathbb{A}$ and rule $\mathbf{P}$ (and rule $\mathbf{P}'$ , resp.), we can now state the following theorem.

Theorem 4. If $\mathbb{A} \vdash_{\mathbf{P},\mathbf{P}'} \varphi$ and $\mathbb{A} \vdash_{\mathbf{P}} (a \rightarrow ((a \rightarrow b) \rightarrow a))$ , then $\mathbb{A} \vdash_{\mathbf{P}} \varphi$ (for all propositional implicational formulas $\varphi$ ).

Proof. All we need to do is to show how an application of rule $\mathbf{P}'$ in a proof tree can be replaced by applications of rule $\mathbf{P}$ which possibly make use of the formula ${a \rightarrow ((a \rightarrow b) \rightarrow a)}$ . Let us assume that a proof contains the following passage where $\sigma(\gamma) = \alpha$ holds for some substitution $\sigma$ :

\begin{align*} \frac{\alpha \rightarrow \beta \qquad \gamma}{\beta} \end{align*}

We replace this passage by the following structure:

For this structure, we assume that b is a fresh variable with respect to $\alpha, \beta, \gamma$ , and the substitution $\sigma$ . $\sigma'$ is the substitution $\sigma$ , extended by the substitution $\left\lbrace b \!\leadsto\! \beta \right\rbrace $ , that is, ${\sigma' = \sigma \cup \left\lbrace b \!\leadsto\! \beta \right\rbrace }$ . As b does not occur in $\gamma$ , ${\sigma'(\gamma) = \sigma(\gamma)}$ holds.

We finish our considerations with some remarks and open questions concerning the interplay between rules $\mathbf{P}$ and $\mathbf{P}'$ in the presence of the two axioms $(a \rightarrow (b \rightarrow c)) \rightarrow ((a \rightarrow b) \rightarrow (a \rightarrow c))$ and ${a \rightarrow (b \rightarrow a)}$ , which, on the one hand, form a basis for the implicational fragment of intuitionistic propositional logic, and, on the other hand, serve as types for combinatory logic based on the combinators $\mathsf{S}$ and $\mathsf{K}$ .

  • It remains an open question whether the formula $a \rightarrow ((a \rightarrow b) \rightarrow a)$ , which plays such a crucial role in the elimination of rule $\mathbf{P}'$ , can be pattern-proved using the two just mentioned axioms. As computer experiments have shown, no CL-term with fewer than eight combinators has the pattern type $a \rightarrow ((a \rightarrow b) \rightarrow a)$ .

  • If the formula $(a \rightarrow (b \rightarrow c)) \rightarrow (b \rightarrow (a \rightarrow c))$ , which is the principal type of the combinator $\mathsf{C}$ , is added as an axiom, the formula $a \rightarrow ((a \rightarrow b) \rightarrow a)$ can be pattern-proved. It is the pattern type of the CL-term $\mathsf{C} \mathsf{S} \mathsf{K}$ . It remains an open question whether the formula $(a \rightarrow (b \rightarrow c)) \rightarrow (b \rightarrow (a \rightarrow c))$ is a pattern type for the basis $\mathsf{S}$ and $\mathsf{K}$ . (Again, there is no CL-term with fewer than eight combinators which serves our purpose.)

  • It also remains open whether the combination of rules $\mathbf{P}$ and $\mathbf{P}'$ can replace rule $\mathbf{D}$ , condensed detachment.

Acknowledgments

The research for this article is a part of the Gödeliana project led by Jan von Plato, to whom I remain grateful for his encouraging support. Also, I would like to thank Anna Maiworm and Isabelle Sauer for the fruitful discussions which finally led to the discovery of Theorems 1 and 3. Finally, I thank the anonymous referees for their careful reading of the submitted paper. The project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No. 787758) and from the Academy of Finland (Decision No. 318066).

Footnotes

1 For an early publication of the underlying algorithm, see Peterson (1978).

2 According to Hindley (1997, p. 103), it was “around 1978, when David Meredith was the first to realize the parallel.”

3 To give an example, the formulas ${(a \rightarrow (b \rightarrow a))}$ , ${(p \rightarrow (q \rightarrow p))}$ , and ${(c \rightarrow (a \rightarrow c))}$ are alphabetic variants of each other.

4 That is, $(Vars(\sigma(\alpha)) \setminus Vars(\alpha)) \cap Vars(\beta) = \emptyset$ .

5 For an alternative (but equivalent) definition of condensed detachment, see Bunder (1995).

6 As we only consider weak reduction (as opposed to strong reduction) in this paper, we will simply speak of reduction.

7 For example, unify ${(a \rightarrow b, b \rightarrow a)}$ returns the substitution $\left\lbrace \left\langle a \!\leadsto\! b \right\rangle\right\rbrace$ , while unify ${(b \rightarrow a, a \rightarrow b)}$ returns $\left\lbrace \left\langle b \!\leadsto\! a \right\rangle\right\rbrace$ .

8 See for example Abelson and Sussman (1984). The mistake has been corrected in the second edition.

9 In our case, the minor premise constitutes the fixed pattern to which the major premise is matched. This seemingly arbitrary choice is taken up again in Section 6.

10 Producing an alphabetic variant is needed in order to avoid unwanted variable capture. As an example, the reader may want to apply rule $\mathbf{P}$ to the premises $a \rightarrow (b \rightarrow a)$ and $b \rightarrow b$ without renaming the variable b.

11 This is a direct consequence of the subject-reduction theorem, see for example Hindley and Seldin (2008, p. 132) or Bimbó (2011, p. 248).

12 I am indebted to the third anonymous referee, who gave the decisive hint to this example and especially to the “Thrush” combinator $\mathsf{T}$ .

13 For brevity, we omit the circle operator in the composition of substitutions.

14 Concerning this logical power and the role of the variables, Schönfinkel (1924, p. 307) writes: “The variable in the logical proposition is nothing more than a marker to indicate certain argument places and operators as belonging together, and thus has the character of a mere auxiliary item that is actually inappropriate to the constant ‘eternal’ nature of the logical proposition.”

15 The reader should note that we could dispense with the index “ $\alpha$ ” in definition (15): As x and y contain every alphabetic variant, definition (15) automatically “produces” every alphabetic variant in the resulting set.

References

Abelson, H. and Sussman, G. J. (1984). Structure and Interpretation of Computer Programs, The MIT Press, Cambridge, MA.
Bimbó, K. (2011). Combinatory Logic: Pure, Applied and Typed, CRC Press, Boca Raton.
Bunder, M. W. (1995). A simplified form of condensed detachment. Journal of Logic, Language, and Information 4 (2) 169–173.
Herbrand, J. (1930). Recherches sur la théorie de la démonstration. PhD dissertation, University of Paris. Reprinted in: J. Herbrand, Écrits logiques, Paris: Presses Universitaires de France 1968, 35–153. English translation of Chapter 5 in: van Heijenoort, J. (ed.) From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931. Harvard University Press, Cambridge, 1967, 529–567.
Hindley, J. R. (1997). Basic Simple Type Theory, Cambridge Tracts in Theoretical Computer Science, Cambridge University Press, Cambridge.
Hindley, J. R. and Seldin, P. (2008). Lambda-Calculus and Combinators – an Introduction, Cambridge University Press, Cambridge.
Kalman, J. A. (1983). Condensed detachment as a rule of inference. Studia Logica 42 (4) 443–451.
Knight, K. (1989). Unification: A multidisciplinary survey. ACM Computing Surveys (CSUR) 21 (1) 93–124.
Lemmon, E. J., Meredith, C. A., Meredith, D., Prior, A. N. and Thomas, I. (1957). Calculi of pure strict implication. Christchurch: Canterbury University College. Reprinted in: Davis, J. W., Hockney, D. J. and Wilson, W. K. (eds.) Philosophical Logic, Reidel, Dordrecht, 1969, 215–250.
Meredith, D. (1977). In memoriam: Carew Arthur Meredith (1904–1976). Notre Dame Journal of Formal Logic 18 (4) 513–516.
Meyer, R. K., Bunder, M. W. and Powers, L. (1991). Implementing the ‘Fool’s model’ of combinatory logic. Journal of Automated Reasoning 7 (4) 597–630.
Norvig, P. (1991). Correcting a widespread error in unification algorithms. Software: Practice and Experience 21 (2) 231–233.
Peterson, J. G. (1978). An automatic theorem prover for substitution and detachment systems. Notre Dame Journal of Formal Logic 19 (1) 119–122.
Robinson, J. A. (1965). A machine-oriented logic based on the resolution principle. Journal of the ACM (JACM) 12 (1) 23–41.
Robinson, J. A. (1979). Logic: Form and Function – The Mechanization of Deductive Reasoning, Edinburgh University Press, Edinburgh.
Russell, S. and Norvig, P. (2010). Artificial Intelligence: A Modern Approach, Prentice Hall Series in Artificial Intelligence, 3rd ed., Pearson, London.
Schönfinkel, M. (1924). Über die Bausteine der mathematischen Logik. Mathematische Annalen 92 (3) 305–316.