Simple Extensions to STLC

The simply typed lambda-calculus has enough structure to make its theoretical properties interesting, but it is not much of a programming language. In this chapter, we begin to close the gap with real-world languages by introducing a number of familiar features that have straightforward treatments at the level of typing.

Numbers

Adding types, constants, and primitive operations for numbers is easy: it is just a matter of combining the Types and Stlc chapters.

let-bindings

When writing a complex expression, it is often useful to give names to some of its subexpressions: this avoids repetition and often increases readability.

Syntax:

       t ::=                Terms
           | ...               (other terms same as before)
           | let x=t in t      let-binding

Reduction:

                   t₁ ⇒ t₁'
       ----------------------------------          (ST_Let1)
       let x=t₁ in t₂ ⇒ let x=t₁' in t₂

       ----------------------------------          (ST_LetValue)
          let x=v₁ in t₂ ⇒ [x:=v₁]t₂

Typing:

       Γ ⊢ t₁ : T₁      Γ , x:T₁ ⊢ t₂ : T₂
       -------------------------------------       (T_Let)
             Γ ⊢ let x=t₁ in t₂ : T₂
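The two reduction rules can be sketched as a small-step evaluator in Python. This is only an illustration under stated assumptions: the class names (`Num`, `Var`, `Add`, `Let`) and the helper functions are hypothetical, the language is cut down to numbers, addition, and let, and substituted values are assumed to be closed so that variable capture cannot occur.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Num:
    n: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Let:
    name: str
    bound: object
    body: object

def is_value(t):
    return isinstance(t, Num)

def subst(x, v, t):
    """Compute [x:=v]t, assuming v is a closed value."""
    if isinstance(t, Var):
        return v if t.name == x else t
    if isinstance(t, Add):
        return Add(subst(x, v, t.left), subst(x, v, t.right))
    if isinstance(t, Let):
        bound = subst(x, v, t.bound)
        # An inner binding of the same name shadows x in the body.
        body = t.body if t.name == x else subst(x, v, t.body)
        return Let(t.name, bound, body)
    return t  # Num

def step(t):
    """Take one small step; return None if t is a value or stuck."""
    if isinstance(t, Add):
        if not is_value(t.left):
            s = step(t.left)
            return None if s is None else Add(s, t.right)
        if not is_value(t.right):
            s = step(t.right)
            return None if s is None else Add(t.left, s)
        return Num(t.left.n + t.right.n)
    if isinstance(t, Let):
        if not is_value(t.bound):                  # ST_Let1
            s = step(t.bound)
            return None if s is None else Let(t.name, s, t.body)
        return subst(t.name, t.bound, t.body)      # ST_LetValue
    return None

def evaluate(t):
    """Iterate step until no rule applies."""
    while True:
        s = step(t)
        if s is None:
            return t
        t = s

# let x = 1 + 2 in x + x
prog = Let("x", Add(Num(1), Num(2)), Add(Var("x"), Var("x")))
```

Stepping `prog` once reduces the bound term to `Num(3)` (ST_Let1); the next step substitutes it into the body (ST_LetValue).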

Pairs

In Coq, the primitive way of extracting the components of a pair is pattern matching. An alternative style is to take fst and snd — the first- and second-projection operators — as primitives. Just for fun, let's do our products this way. For example, here's how we'd write a function that takes a pair of numbers and returns the pair of their sum and difference:

       λx:Nat*Nat. 
          let sum = x.fst + x.snd in
          let diff = x.fst - x.snd in
          (sum,diff)

Syntax:

       t ::=                Terms
           | ...               
           | (t,t)             pair
           | t.fst             first projection
           | t.snd             second projection

       v ::=                Values
           | ...
           | (v,v)             pair value

       T ::=                Types
           | ...
           | T * T             product type

For evaluation, we need several new rules specifying how pairs and projection behave.

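            t₁ ⇒ t₁'
       --------------------                        (ST_Pair1)
       (t₁,t₂) ⇒ (t₁',t₂)

            t₂ ⇒ t₂'
       --------------------                        (ST_Pair2)
       (v₁,t₂) ⇒ (v₁,t₂')

            t₁ ⇒ t₁'
       ------------------                          (ST_Fst1)
       t₁.fst ⇒ t₁'.fst

       ------------------                          (ST_FstPair)
       (v₁,v₂).fst ⇒ v₁

            t₁ ⇒ t₁'
       ------------------                          (ST_Snd1)
       t₁.snd ⇒ t₁'.snd

       ------------------                          (ST_SndPair)
       (v₁,v₂).snd ⇒ v₂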
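One consequence of these rules is that pairs are evaluated left to right, and a projection fires only once both components are values. The following Python sketch illustrates this; the tuple encoding of terms and the `succ` operation (included only so that there is something to reduce) are assumptions made for the example.

```python
# Terms: ("nat", n), ("succ", t), ("pair", t1, t2), ("fst", t), ("snd", t).

def is_value(t):
    if t[0] == "nat":
        return True
    if t[0] == "pair":
        return is_value(t[1]) and is_value(t[2])
    return False

def step(t):
    """Take one small step; return None if no rule applies."""
    tag = t[0]
    if tag == "succ":
        if t[1][0] == "nat":
            return ("nat", t[1][1] + 1)
        s = step(t[1])
        return None if s is None else ("succ", s)
    if tag == "pair":
        if not is_value(t[1]):                      # ST_Pair1
            s = step(t[1])
            return None if s is None else ("pair", s, t[2])
        if not is_value(t[2]):                      # ST_Pair2
            s = step(t[2])
            return None if s is None else ("pair", t[1], s)
        return None                                 # already a value
    if tag in ("fst", "snd"):
        arg = t[1]
        if arg[0] == "pair" and is_value(arg):      # ST_FstPair / ST_SndPair
            return arg[1] if tag == "fst" else arg[2]
        s = step(arg)                               # ST_Fst1 / ST_Snd1
        return None if s is None else (tag, s)
    return None

# (succ 1, succ 5).fst
term = ("fst", ("pair", ("succ", ("nat", 1)), ("succ", ("nat", 5))))
```

Stepping `term` reduces the first component, then the second, and only then projects, even though the second component is discarded by `fst`.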

The typing rules for pairs and projections are straightforward.

       Γ ⊢ t₁ : T₁      Γ ⊢ t₂ : T₂
       ------------------------------              (T_Pair)
           Γ ⊢ (t₁,t₂) : T₁*T₂

           Γ ⊢ t₁ : T₁₁*T₁₂
           ------------------                      (T_Fst)
           Γ ⊢ t₁.fst : T₁₁

           Γ ⊢ t₁ : T₁₁*T₁₂
           ------------------                      (T_Snd)
           Γ ⊢ t₁.snd : T₁₂
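Read as an algorithm, the three typing rules give a straightforward recursive checker. The sketch below is one possible rendering in Python; the encoding of types and terms as tuples, and the name `type_of`, are assumptions of this example rather than anything fixed by the chapter.

```python
# Types: "Nat" or ("Prod", T1, T2).
# Terms: ("nat", n), ("var", x), ("pair", t1, t2), ("fst", t), ("snd", t).

def type_of(ctx, t):
    """Return the type of t under context ctx (a dict), or raise TypeError."""
    tag = t[0]
    if tag == "nat":
        return "Nat"
    if tag == "var":
        if t[1] not in ctx:
            raise TypeError(f"unbound variable {t[1]}")
        return ctx[t[1]]
    if tag == "pair":                               # T_Pair
        return ("Prod", type_of(ctx, t[1]), type_of(ctx, t[2]))
    if tag in ("fst", "snd"):                       # T_Fst / T_Snd
        ty = type_of(ctx, t[1])
        if not (isinstance(ty, tuple) and ty[0] == "Prod"):
            raise TypeError(f"projection from non-product type {ty}")
        return ty[1] if tag == "fst" else ty[2]
    raise TypeError(f"unknown term {t!r}")

# x : Nat * (Nat * Nat)
ctx = {"x": ("Prod", "Nat", ("Prod", "Nat", "Nat"))}
```

For instance, `type_of(ctx, ("snd", ("var", "x")))` yields the inner product type, while projecting from a number is rejected.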

Unit

Another handy base type, found especially in languages in the ML family, is the singleton type Unit.

Syntax:

       t ::=                Terms
           | ...               
           | unit              unit value

       v ::=                Values
           | ...     
           | unit              unit

       T ::=                Types
           | ...
           | Unit              Unit type

Typing:

       ----------------                            (T_Unit)
       Γ ⊢ unit : Unit

Sums

Many programs need to deal with values that can take two distinct forms. For example, we might identify employees in an accounting application using either their name or their id number. A search function might return either a matching value or an error code.

These are specific examples of a binary sum type, which describes a set of values drawn from exactly two given types, e.g.

       Nat + Bool

We create elements of these types by tagging elements of the component types, telling on which side we are putting them. E.g.,

   inl 42 : Nat + Bool
   inr true : Nat + Bool

In general, the elements of a type T₁ + T₂ consist of the elements of T₁ tagged with the token inl, plus the elements of T₂ tagged with inr.

One important usage of sums is signaling errors:

    div : Nat -> Nat -> (Nat + Unit)
    div =
      λx:Nat. λy:Nat.
        if iszero y then
          inr unit
        else
          inl ...

The type Nat + Unit above is in fact isomorphic to option nat in Coq, and we've already seen how to signal errors with options.
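The same error-signaling pattern can be imitated in Python with tagged tuples. This is a loose sketch: the helpers `inl` and `inr`, the `UNIT` placeholder, and `describe` are hypothetical names, and because the example in the text leaves the non-zero branch elided, the integer quotient used here is an assumption.

```python
def inl(v):
    """Tag v as belonging to the left summand."""
    return ("inl", v)

def inr(v):
    """Tag v as belonging to the right summand."""
    return ("inr", v)

UNIT = ()  # stand-in for the single value of type Unit

def div(x, y):
    """Nat -> Nat -> Nat + Unit: inr unit signals division by zero."""
    if y == 0:
        return inr(UNIT)
    return inl(x // y)  # assumed payload: the integer quotient

def describe(result):
    """A caller must inspect the tag before it can use the payload."""
    tag, payload = result
    if tag == "inl":
        return f"quotient {payload}"
    return "division by zero"
```

The caller cannot reach the quotient without first looking at the tag, which is the point of returning a sum rather than a bare number.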

Here is a simple example showing how to do case analysis in order to use values of sum type:

    getNat = 
      λx:Nat+Bool.
        case x of
          inl n => n
        | inr b => if b then 1 else 0

Syntax:

       t ::=                Terms
           | ...               
           | inl T t           tagging (left)
           | inr T t           tagging (right)
           | case t of         case
               inl x => t
             | inr x => t 

       v ::=                Values
           | ...
           | inl T v           tagged value (left)
           | inr T v           tagged value (right)

       T ::=                Types
           | ...
           | T + T             sum type

Evaluation:

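               t₁ ⇒ t₁'
        ----------------------                          (ST_Inl)
        inl T t₁ ⇒ inl T t₁'

               t₁ ⇒ t₁'
        ----------------------                          (ST_Inr)
        inr T t₁ ⇒ inr T t₁'

                       t0 ⇒ t0'
       ----------------------------------------------   (ST_Case)
       case t0 of inl x1 => t₁ | inr x2 => t₂
       ⇒ case t0' of inl x1 => t₁ | inr x2 => t₂

       ----------------------------------------------   (ST_CaseInl)
       case (inl T v0) of inl x1 => t₁ | inr x2 => t₂
       ⇒ [x1:=v0]t₁

       ----------------------------------------------   (ST_CaseInr)
       case (inr T v0) of inl x1 => t₁ | inr x2 => t₂
       ⇒ [x2:=v0]t₂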

Typing:

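              Γ ⊢ t₁ : T₁
       --------------------------                       (T_Inl)
       Γ ⊢ inl T₂ t₁ : T₁ + T₂

              Γ ⊢ t₁ : T₂
       --------------------------                       (T_Inr)
       Γ ⊢ inr T₁ t₁ : T₁ + T₂

                    Γ ⊢ t0 : T₁+T₂
                  Γ , x1:T₁ ⊢ t₁ : T
                  Γ , x2:T₂ ⊢ t₂ : T
       ------------------------------------------------ (T_Case)
       Γ ⊢ case t0 of inl x1 => t₁ | inr x2 => t₂ : T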

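We annotate inl and inr with the type of the other summand so that every tagged term has exactly one type; this keeps the typing rules simple, just as the annotation on the bound variable did for functions. (The informal examples above omit these annotations.)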
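To see how the annotation helps, here is a minimal Python sketch of a checker for T_Inl, T_Inr, and T_Case. The tuple encoding and the name `type_of` are assumptions of the example. Note that the checker never has to guess the missing summand: it is read directly off the term.

```python
# Types: "Nat", "Bool", or ("Sum", T1, T2).
# Terms: ("nat", n), ("bool", b), ("var", x),
#        ("inl", T2, t), ("inr", T1, t),
#        ("case", t0, x1, t1, x2, t2).

def type_of(ctx, t):
    """Return the type of t under context ctx (a dict), or raise TypeError."""
    tag = t[0]
    if tag == "nat":
        return "Nat"
    if tag == "bool":
        return "Bool"
    if tag == "var":
        if t[1] not in ctx:
            raise TypeError(f"unbound variable {t[1]}")
        return ctx[t[1]]
    if tag == "inl":                    # T_Inl: annotation is the right summand
        return ("Sum", type_of(ctx, t[2]), t[1])
    if tag == "inr":                    # T_Inr: annotation is the left summand
        return ("Sum", t[1], type_of(ctx, t[2]))
    if tag == "case":                   # T_Case
        _, t0, x1, t1, x2, t2 = t
        ty0 = type_of(ctx, t0)
        if not (isinstance(ty0, tuple) and ty0[0] == "Sum"):
            raise TypeError("case on a non-sum type")
        left = type_of({**ctx, x1: ty0[1]}, t1)
        right = type_of({**ctx, x2: ty0[2]}, t2)
        if left != right:
            raise TypeError("case branches have different types")
        return left
    raise TypeError(f"unknown term {t!r}")

ctx = {"s": ("Sum", "Nat", "Bool")}
# case s of inl n => n | inr b => 0
ok = ("case", ("var", "s"), "n", ("var", "n"), "b", ("nat", 0))
# case s of inl n => n | inr b => b   (branches disagree)
bad = ("case", ("var", "s"), "n", ("var", "n"), "b", ("var", "b"))
```

Without the annotation, `inl 42` could be given the type Nat + T for any T, and the checker would need some form of inference to pick one.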

Lists

Syntax:

       t ::=                Terms
           | ...
           | nil T
           | cons t t
           | lcase t of nil -> t | x::x -> t

       v ::=                Values
           | ...
           | nil T             nil value
           | cons v v          cons value

       T ::=                Types
           | ...
           | List T            list of Ts

Reduction:

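                t₁ ⇒ t₁'
       --------------------------                       (ST_Cons1)
       cons t₁ t₂ ⇒ cons t₁' t₂

                t₂ ⇒ t₂'
       --------------------------                       (ST_Cons2)
       cons v₁ t₂ ⇒ cons v₁ t₂'

                      t₁ ⇒ t₁'
       ----------------------------------------         (ST_Lcase1)
       (lcase t₁ of nil → t₂ | xh::xt → t₃)
       ⇒ (lcase t₁' of nil → t₂ | xh::xt → t₃)

       ----------------------------------------         (ST_LcaseNil)
       (lcase nil T of nil → t₂ | xh::xt → t₃)
       ⇒ t₂

       ------------------------------------------------ (ST_LcaseCons)
       (lcase (cons vh vt) of nil → t₂ | xh::xt → t₃)
       ⇒ [xh:=vh,xt:=vt]t₃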

Typing:

       -------------------                              (T_Nil)
       Γ ⊢ nil T : List T

       Γ ⊢ t₁ : T      Γ ⊢ t₂ : List T
       ---------------------------------                (T_Cons)
           Γ ⊢ cons t₁ t₂ : List T

                  Γ ⊢ t₁ : List T₁
                    Γ ⊢ t₂ : T
           Γ , h:T₁, t:List T₁ ⊢ t₃ : T
       ------------------------------------------       (T_Lcase)
       Γ ⊢ (lcase t₁ of nil → t₂ | h::t → t₃) : T
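The behavior of lcase on list values (rules ST_LcaseNil and ST_LcaseCons) can be mimicked in Python by passing the two branches as functions. This is a sketch only: the tuple representation of lists and the names `NIL`, `cons`, `lcase`, `length`, and `head_or` are hypothetical, and `length` leans on Python's own recursion rather than anything defined in this chapter.

```python
NIL = ("nil",)

def cons(head, tail):
    return ("cons", head, tail)

def lcase(lst, on_nil, on_cons):
    """Eliminate a list value.

    on_nil is a thunk standing for the nil branch; on_cons receives the
    head and tail, playing the role of the cons branch with xh and xt bound.
    """
    if lst[0] == "nil":                 # ST_LcaseNil
        return on_nil()
    _, head, tail = lst                 # ST_LcaseCons
    return on_cons(head, tail)

def length(lst):
    return lcase(lst, lambda: 0, lambda _h, t: 1 + length(t))

def head_or(default, lst):
    return lcase(lst, lambda: default, lambda h, _t: h)

xs = cons(5, cons(6, cons(7, NIL)))
```

Both branches must produce results of the same kind, mirroring the requirement in T_Lcase that t₂ and t₃ share the type T.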

General Recursion

Another facility found in most programming languages (including Coq) is the ability to define recursive functions. For example, we might like to be able to define the factorial function like this:

   fact = λx:Nat. 
             if x=0 then 1 else x * (fact (pred x))

But this would require quite a bit of work to formalize: we'd have to introduce a notion of "function definitions" and carry around an "environment" of such definitions in the definition of the step relation.

Here is another way that is straightforward to formalize: instead of writing recursive definitions where the right-hand side can contain the identifier being defined, we can define a fixed-point operator that performs the "unfolding" of the recursive definition in the right-hand side lazily during reduction.

   fact = 
       fix
         (λf:Nat->Nat.
            λx:Nat. 
               if x=0 then 1 else x * (f (pred x)))

Syntax:

       t ::=                Terms
           | ...
           | fix t             fixed-point operator

Reduction:

            t₁ ⇒ t₁'
       ------------------                               (ST_Fix1)
       fix t₁ ⇒ fix t₁'

           F = λxf:T₁.t₂
       -----------------------                          (ST_FixAbs)
       fix F ⇒ [xf:=fix F]t₂

Typing:

       Γ ⊢ t₁ : T₁->T₁
       ----------------                                 (T_Fix)
       Γ ⊢ fix t₁ : T₁

Let's see how ST_FixAbs works by reducing fact 3 = fix F 3, where F = (λf. λx. if x=0 then 1 else x * (f (pred x))) (type annotations are omitted here for brevity).

fix F 3

⇒ ST_FixAbs + ST_App1

(λx. if x=0 then 1 else x * (fix F (pred x))) 3

⇒ ST_AppAbs

if 3=0 then 1 else 3 * (fix F (pred 3))

⇒ ST_If0_Nonzero

3 * (fix F (pred 3))

⇒ ST_FixAbs + ST_App1 + ST_Mult2

3 * ((λx. if x=0 then 1 else x * (fix F (pred x))) (pred 3))

⇒ ST_PredNat + ST_Mult2 + ST_App2

3 * ((λx. if x=0 then 1 else x * (fix F (pred x))) 2)

⇒ ST_AppAbs + ST_Mult2

3 * (if 2=0 then 1 else 2 * (fix F (pred 2)))

⇒ ST_If0_Nonzero + ST_Mult2

3 * (2 * (fix F (pred 2)))

⇒ ST_FixAbs + ST_App1 + 2 x ST_Mult2

3 * (2 * ((λx. if x=0 then 1 else x * (fix F (pred x))) (pred 2)))

⇒ ST_PredNat + 2 x ST_Mult2 + ST_App2

3 * (2 * ((λx. if x=0 then 1 else x * (fix F (pred x))) 1))

⇒ ST_AppAbs + 2 x ST_Mult2

3 * (2 * (if 1=0 then 1 else 1 * (fix F (pred 1))))

⇒ ST_If0_Nonzero + 2 x ST_Mult2

3 * (2 * (1 * (fix F (pred 1))))

⇒ ST_FixAbs + ST_App1 + 3 x ST_Mult2

3 * (2 * (1 * ((λx. if x=0 then 1 else x * (fix F (pred x))) (pred 1))))

⇒ ST_PredNat + 3 x ST_Mult2 + ST_App2

3 * (2 * (1 * ((λx. if x=0 then 1 else x * (fix F (pred x))) 0)))

⇒ ST_AppAbs + 3 x ST_Mult2

3 * (2 * (1 * (if 0=0 then 1 else 0 * (fix F (pred 0)))))

⇒ ST_If0Zero + 3 x ST_Mult2

3 * (2 * (1 * 1))

⇒ ST_MultNats + 2 x ST_Mult2

3 * (2 * 1)

⇒ ST_MultNats + ST_Mult2

3 * 2

⇒ ST_MultNats

6
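The same unfolding behavior can be reproduced in Python. The sketch below is an analogy rather than a faithful encoding: the names `fix`, `F`, and `fact` follow the text, but this `fix` works only at function types, and `pred x` is written as `x - 1` since it is reached only when x is non-zero.

```python
def fix(F):
    """Return a fixed point of F, unfolding one layer per call.

    The wrapper lambda delays the recursive fix(F) until the resulting
    function is actually applied; without it, Python's eager evaluation
    would keep unfolding forever.
    """
    return lambda x: F(fix(F))(x)

# F takes the "recursive call" f and returns the body of factorial.
F = lambda f: lambda x: 1 if x == 0 else x * f(x - 1)

fact = fix(F)
```

Each call to `fact` performs one unfolding of `F`, just as each use of ST_FixAbs in the trace above replaces `fix F` by one copy of the body of `F`.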

Records

As a final example of a basic extension of the STLC, let's look briefly at how to define records and their types. Intuitively, records can be obtained from pairs by two kinds of generalization: they are n-ary products (rather than just binary) and their fields are accessed by label (rather than position).

This extension is conceptually a straightforward generalization of pairs and product types, but notationally it becomes a little heavier; for this reason, we postpone its formal treatment to a separate chapter (Records). Therefore records are not included in the extended exercise below, but they are used to motivate the Sub chapter.

Syntax:

       t ::=                          Terms
           | ...
           | {i1=t₁, ..., in=tn}         record 
           | t.i                         projection

       v ::=                          Values
           | ...
           | {i1=v₁, ..., in=vn}         record value

       T ::=                          Types
           | ...
           | {i1:T₁, ..., in:Tn}         record type

Intuitively, the generalization is pretty obvious. But it's worth noticing that what we've actually written is rather informal: in particular, we've written "..." in several places to mean "any number of these," and we've omitted explicit mention of the usual side-condition that the labels of a record should not contain repetitions. It is possible to devise informal notations that are more precise, but these tend to be quite heavy and to obscure the main points of the definitions. So we'll leave these a bit loose here (they are informal anyway, after all) and do the work of tightening things up elsewhere (in chapter Records).

Reduction:

                    ti ⇒ ti'
       ------------------------------------             (ST_Rcd)
       {i1=v₁, ..., im=vm, in=ti, ...}
       ⇒ {i1=v₁, ..., im=vm, in=ti', ...}

             t₁ ⇒ t₁'
       ------------------                               (ST_Proj1)
         t₁.i ⇒ t₁'.i

       --------------------------                       (ST_ProjRcd)
       {..., i=vi, ...}.i ⇒ vi

Again, these rules are a bit informal. For example, the first rule is intended to be read "if ti is the leftmost field that is not a value and if ti steps to ti', then the whole record steps..." In the last rule, the intention is that there should only be one field called i, and that all the other fields must contain values.

Typing:

         Γ ⊢ t₁ : T₁     ...     Γ ⊢ tn : Tn
       ------------------------------------------------ (T_Rcd)
       Γ ⊢ {i1=t₁, ..., in=tn} : {i1:T₁, ..., in:Tn}

       Γ ⊢ t : {..., i:Ti, ...}
       --------------------------                       (T_Proj)
             Γ ⊢ t.i : Ti
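As a rough illustration of T_Rcd and T_Proj, including the side condition that labels must not repeat, here is a small Python checker for closed terms. Everything about the encoding is an assumption of this sketch: records are lists of (label, term) pairs so that a repeated label can be represented and rejected, and the name `type_of` is hypothetical. The precise formal treatment is the one deferred to the Records chapter.

```python
# Types: "Nat", "Bool", or ("Rcd", ((label, type), ...)).
# Terms: ("nat", n), ("bool", b),
#        ("rcd", [(label, term), ...]), ("proj", term, label).

def type_of(t):
    """Return the type of a closed term t, or raise TypeError."""
    tag = t[0]
    if tag == "nat":
        return "Nat"
    if tag == "bool":
        return "Bool"
    if tag == "rcd":                                # T_Rcd
        labels = [label for label, _ in t[1]]
        if len(set(labels)) != len(labels):
            raise TypeError("duplicate label in record")
        return ("Rcd", tuple((label, type_of(field)) for label, field in t[1]))
    if tag == "proj":                               # T_Proj
        ty = type_of(t[1])
        if not (isinstance(ty, tuple) and ty[0] == "Rcd"):
            raise TypeError("projection from a non-record type")
        for label, field_ty in ty[1]:
            if label == t[2]:
                return field_ty
        raise TypeError(f"record has no field {t[2]!r}")
    raise TypeError(f"unknown term {t!r}")

# {age=30, admin=true}
rec = ("rcd", [("age", ("nat", 30)), ("admin", ("bool", True))])
```

Projection looks a field up by label rather than by position, which is the second of the two generalizations over pairs described at the start of this section.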
