The Calculi of Lambda Conversion

by Alonzo Church

Following are the first few pages of the book on the λ-calculus by Alonzo Church (1941). Html transcription adapted and extended from here.

1. The Concept of a Function

Underlying the formal calculi which we shall develop is the concept of a function, as it appears in various branches of mathematics, either under that name or under one of the synonymous names, “operation” or “transformation.” The study of the general properties of functions, independently of their appearance in any particular mathematical (or other) domain, belongs to formal logic or lies on the boundary line between logic and mathematics. This study is the original motivation for the calculi — but they are so formulated that it is possible to abstract from the intended meaning and regard them merely as formal systems.

A function is a rule of correspondence by which when anything is given (as argument) another thing (the value of the function for that argument) may be obtained. That is, a function is an operation which may be applied on one thing (the argument) to yield another thing (the value of the function). It is not, however, required that the operation shall necessarily be applicable to everything whatsoever; but for each function there is a class, or range, of possible arguments — the class of things to which the operation is significantly applicable — and this we shall call the range of arguments, or range of the independent variable, for that function. The class of all values of the function, obtained by taking all possible auguments, will be called the range of values, or range of the dependent variable.

If f denotes a particular function, we shall use the notation (fa) for the value of the function f for the argument a. If a does not belong to the range of arguments of f, the notation (fa) shall be meaningless.

It is, of course, not excluded that the range of arguments or range of values of a function should consist wholly or partly of functions. The derivative, as this notion appears in the elementary differential calculus, is a familiar mathematical example of a function for which both ranges consist of functions. Or, turning to the integral calculus, if in the expression ∫₀¹(fx)dx we take the function f as independent variable, we are led to a function for which the range of arguments consists of functions and the range of values, of numbers. Formal logic provides other examples; thus the existential quantifier, according to the present account, is a function for which the range of arguments consists of propositional functions, and the range of values consists of truth values.

In particular it is not excluded that one of the elements of the range of arguments of a function f should be the function f itself. This possibility has frequently been denied, and indeed, if a function is defined as a correspondence between two previously given ranges, the reason for the denial is clear. Here, however, we regard the operation or rule of correspondence, which constitutes the function, as being first given, and the range of arguments then determined as consisting of the things to which the operation is applicable. This is a departure from the point of view usual in mathematics, but it is a departure which is natural in passing from consideration of functions in a special domain to the consideration of function in general, and it finds support in consistency theorems which will be proved below.

The identity function I is defined by the rule that (Ix) is x, whatever x may be; then in particular (II) is I. If a function H is defined by the rule that (Hx) is I, whatever x may be, then in particular (HH) is I. If Σ is the existential quantifier, then (ΣΣ) is the truth-value truth.

The functions I and H may also be cited as examples of functions for which the range of arguments consists of all things whatsoever.

2. Extension and Intension

The foregoing discussion leaves it undetermined under what circumstances two functions shall be considered the same.

The most immediate and, from some points of view, the best way to settle this question is to specify that two functions f and g are the same if they have the same range of arguments and, for every element a that belongs to this range, (fa) is the same as (ga). When this is done we shall say that we are dealing with functions in extension.

It is possible, however, to allow two functions to be different on the ground that the rule of correspondence is different in meaning in the two cases although always yielding the same result when applied to any particular argument. When this is done we shall say that we are dealing with functions in intension. The notion of difference in meaning between two rules of correspondence is a vague one, but, in terms of some system of notation, it can be made exact in various ways. We shall not attempt to decide what is the true notion of difference in meaning but shall speak of functions in intension in any case where a more severe criterion of identity is adopted than for functions in extension. There is thus not one notion of function in intension, but many notions, involving various degrees of intensionality.

In the calculus of λ-conversion and the calculus of restricted λ-K-conversion, as developed below, it is possible, if desired, to interpret the expressions of the calculus as denoting functions in extension. However, in the calculus of λ-δ-conversion, where the notion of identity of functions is introduced into the system by the symbol δ, it is necessary, in order to preserve the finitary character of the transformation rules, so to formulate these rules that an interpretation by functions in extension becomes impossible. The expressions which appear in the calculus of λ-δ-conversion are interpretable as denoting functions in intension of an appropriate kind.

3. Functions of Several Variables

So far we have tacitly restricted the term “function” to functions of one variable (or, of one argument). It is desirable, however, for each positive integer n, to have the notion of a function of n variables. And, in order to avoid the introduction of a separate primitive idea for each n, it is desirable to find a means of explaining functions of n variables as particular cases of functions of one variable. For our present purpose, the most convenient and natural method of doing this is to adopt an idea of Schönfinkel (1924), according to which a function of two variables is regarded as a function of one variable whose values are functions of one variable, a function of three variables as a function of one variable whose values are functions of two variables, and so on.

Thus if f denotes a particular function of two variables, the notation ((fa)b) — which we shall frequently abbreviate as (fab) or fab — represents the value of f for the arguments a, b. The notation (fa) — which we shall frequently abbreviate as fa — represents a function of one variable, whose value for any argument x is fax. The function f has a range of arguments, and the notation fa is meaningful only when a belongs to that range; the function fa again has a range of arguments, which is, in general different for differest elements a, and the notation fab is meaningful only when b belongs to that range of arguments of fa.

Similarly, if f denotes a function of three variables, (((fa)b)c) or fabc denotes a value of f for the arguments a,b,c, fa denoting a certain function of two variables, and ((fa)b) or fab denoting a certain function of one variable — and so on.

(According to another scheme, which is the better one for certain purposes, a function of two variables is regarded as a function (of one variable) whose arguments are ordered pairs, a function of three variables as a function whose arguments are triads, and so on. This other concept of a function of several variables is not, however excluded here. For, as will appear below, the notions of ordered pair, ordered triad, etc., are definable by means of abstraction (§4) and the Schönfinkel concept of a function of several variables; and thus functions of several variables in the other sense are also provided for.)

An example of a function of two variables (in the sense of Schönfinkel) is the constancy function K, defined by the rule that Kxy is x, whatever x or y may be. We have, for instance that KII is I, KHI is H, and so on. Also KI is H (where H is the function defined above in §1). Similarly KK is a function whose value is constant and equal to K.

Another example of a function of two variables is the function whose value for the arguments f, x is (fx); for reasons which will appear later we designate this function by the symbol 1. This function 1, regarded as a function of one variable, is a kind of identity function, since the notation (1f) whenever significant, denotes the same function as f; the function I and 1 are not, however, the same function, since the range of the argument consists in one case of all things whatsoever, in the other case merely of all functions.

Other examples of functions of two or more variables are the function H, already defined, and the functions T, J, B, C, W, S, defined respectively by the rules that Txf is (fx), Jfxyz is fx(fzy), Bfgx is f(gx), Cfxy is (fyx), Wfx is (fxx), Snfx is f(nfx).

Of these, B and C may be more familiar to the reader under other names, as the product or resultant of two transformations f and g, and as the converse of a function of two variables f. To say that BII is I is to say that the product of the identity transformation by the identity transformation is the identity transformation whatever the domain within which the transformations are being considered; to say that B11 is 1 is to say that within any domain consisting entirely of functions the product of the identity transformation by itself is the identity transformation. BI is 1, since it is the operation of composition with the identity transformation, and thus an identity operation, but one applicable only to transformations.

The reader may further verify that CK is H, CT is 1, C1 is T — that 1 and I have the same converse is explained by the fact that, while not the same functions, they have the same effect in all cases where they can significantly be applied to two arguments. The function BCC, the converse of the converse, has the effect of an identity when applied to a function of two variables, but when applied to a function of one variable it has the effect of so restricting he range of the arguments as to transform the function into a function of two variables (if possible); thus BCCI is 1.

There are many similar relations between these functions, some of them quite complicated.

4. Abstraction

For our present purpose it is necessary to distinguish carefully between a symbol or expression which denotes a function and an expression which contains a variable and denotes ambiguously some value of the functions — a distinction which is more or less obscured in the usual language of mathematical function theory.

To take an example from the theory of functions of natural numbers, consider the expression (x²+x)². If we say, “(x²+x)² is greater than 1,000” we make a statement which depends on x and actually has no meaning unless x is determined as some particular natural number. On the other hand, if we say, “(x²+x)² is a primitive recursive function,” we make a definitive statement whose meaning in no way depends on a determination of the variable x (so that in this case x plays the rôle of an apparent, or bound, variable). The difference between the two cases is that in the first case the expression (x²+x)² serves as an ambiguous, or variable, denotation of a natural number, while in the second case it serves as the denotation of a particular function. We shall hereafter distinguish by using (x²+x)² when we intend an ambiguous denotation of a natural number, but (λx(x²+x)²) as the denotation of the corresponding function — and likewise in other cases.

(It is, of course, irrelevant here that the notation (x²+x)² is commonly used also for a certain function of real numbers, a certain function of complex numbers, etc. In a logically exact notation the function, addition of natural numbers, addition of complex numbers, would be denoted by different symbols, say +_n, +_r, +_c, and these three functions, square of a natural number, square of a real number, square of a complex number, would be similarly distinguished. The uncertainties as to the exact meaning of the notation (x²+x)², and the consequent uncertainty as to the range of the arguments of the function (x²+x)² would then disappear.)

In general if M is an expression containing a variable x (as a free variable, i.e. in such a way that the meaning of M depends on a determination of x), then (λxM) denotes a function whose value, for argument α is denoted by the result of substituting (a symbol denoting) α for x in M. The range of arguments of the function (λxM) consists of all objects α such the expression M has a meaning when (a symbol denoting) α is substituted for x.

If M does not contain the variable x (as a free variable), then (λxM) might be used to denote a function whose value is constant and equal to (the thing denoted by) M, and whose range of arguments consists of all things. This usage is contemplated below in connection with the calculi of λ-K-conversion, but is excluded from the calculi of λ-conversion and λ-δ-conversion — for technical reasons which will appear.

Notice that, although x occurs as a free variable in M, nevertheless, in the expression (λxM), x is a bound or apparent, variable. Example: the equation (x²+x)² = (y²+y)² expresses a relation between the natural numbers denoted by x and y and its truth depends on a determination of x and y (in fact, it is true if and only if x and y are determined as denoting the same natural number); but the equation (λx(x²+x)²) = (λy(y²+y)²) expresses a particular proposition — namely that (λx(x²+x)²) is the same function as (λy(y²+y)²) — and it is true (there is no question of a determination of x and y).

Notice that λ, or λx, is not the name of any function or other abstract object, but is an incomplete symbol — i.e. the symbol has no meaning alone, but appropriate formed expressions containing the symbol have meaning. We call the symbol λx an abstraction operator, and speak of the function which is denoted by (λxM) as obtained from the expression M by abstraction.

The expression (λx(λyM)), which we shall often abbreviate (λxy.M), denoted a function whose value, for an argument whose value is denoted by x, is denoted by (λyM) — thus a function whose values are functions, or a function of two variables. The expression (λy(λxM)), abbreviated as (λyx.M) denotes the converse function to that denoted by (λyx.M). Similarly (λx(λy(λzM))), abbreviated as (λxyz.M), denotes a function of three variables, and so on.

Functions introduced in previous sections as examples can now be expressed, if desired, by means of abstraction operators. For instance I is (λxx); J is (λfxyz.fx(fzy)); S is (λnfx.f(nfx)); H is (λxI), or (λx(λyy)) or (λxy.y); K is (λxy.x); 1 is (λfx.fx).

Chapter II: Lambda Conversion

5. PRIMITIVE SYMBOLS, AND FORMULAS.

We turn now to the development of a formal system, which we shall call the calculus of λ-conversion, and which shall have as a possible interpretation or appilcation the system or ideas about functions of chapter I.

The primitive symbols of this calculus are three symbols,

λ, (, ),

which we shall call improper symbols, and an infinite list of symbols,

a, b, c, ... , x, y, z, a', b', ... , z', a'' ... ,

Which we shall call variables. The order in which the symbols appear in this originally given list shall be called their alphabetical order. A formula is any finite sequence of primitive symbols. Certain formulas are distinguished as well-formed formulas, and each occurrence of variable in a well-formed formula is distinguished as free or bound, in accordance with the following rules (1−4),

1. A variable x is a well-formed formula, and the occurrence of the variable x is this formula is free.

2. If F and A are well-formed, (FA) is well-formed, and an occurrence of a variable y in F is free or bound in (FA) according as it is free or bound in F, and an occurrence of a variable y in A is free or bound in (FA) according as it is free or bound in A according as it is free or bound in A.

3. If M is well-formed and contains at least one free occurrence of x, then (λxM) is well-formed, and an occurrence of variable y, other than x, in (λxM) according as it is free or bound in M. All occurrences of x are bound in (λxM).

4. A formula is well-formed, and an occurrence of a variable in it is free, or is bound, only when this follows from 1−3.

The free variables of a formula are the variables with at least one free occurrence in the formula. The bound variables of a formula are the variables with at least one bound occurrence in the formula.

Hereafter (as was just done in the statement of the rules 1−4) we shall use bold capital letters to stand for variable or undetermined formulas, and bold small letters to stand for variable or undetermined variables. Unless otherwise indicated in a particular case, it is to be understood that the formulas represented by bold letters are well-formed formulas. Bold letters are thus not part of the calculus we are developing but a device for talking about the calculus: they belong, not to the system itself, but to the metamathematics or syntax of the system.

Another syntactical notation we shall use is the notation,

S_N^xM|

which shall stand for the formula which results by substitution of N for x throughout M. This formula is well-formed, except in the case that x is a bound variable of M and N is other than a single variable — see §7. (In the special case that x does not occur in M, it is the same formula as M.)

For brevity and perspicuity in dealing with particular well-formed formulas, we ofter do not write them in full but employ various abbreviations.

One methd of abbreviation is by means of a nominal definitions, which introduces a particular new symbol to replace or stand for a particular well-formed formula. We indicate such a nominal definition by an arrow, pointing from the new symbol which is being introduced to the well-formed formula which it is to replace (the arrow my be read “stands for”). As an example we make at once the nominal definition:

I → (λaa).

This means that I will be used as an abbreviation for (λaa) — and consequently that (II) will be used as an abbreviation for ((λaa)(λaa)) and (λa(aI)) as an abbreviation for (λa(a(λaa))), etc.