StlcThe Simply Typed Lambda-Calculus


Require Export Types.

The Simply Typed Lambda-Calculus

Our job for this chapter: Formalize a small functional language and its type system
Language: The simply typed lambda-calculus.
  • A small subset of Coq's built-in functional language...
  • ...but we'll use different concrete syntax (for consistency with standard treatments of the STLC)
Main new technical challenges:
  • variable binding
  • substitution

Overview

Begin with some set of base types (here, just Bool)
Add:
  • variables
  • function abstractions
  • application
Informal concrete syntax:
       t ::= x                       variable
           | λx:T1.t2                abstraction
           | t1 t2                   application
           | true                    constant true
           | false                   constant false
           | if t1 then t2 else t3   conditional
The λ symbol (backslash, in ascii) in a function abstraction λx:T1.t2 is generally written as a greek letter "lambda" (hence the name of the calculus). The variable x is called the parameter to the function; the term t1 is its body. The annotation :T specifies the type of arguments that the function can be applied to.

Some examples:
  • λx:Bool. x
    The identity function for booleans.
  • x:Bool. x) true
    The identity function for booleans, applied to the boolean true.
  • λx:Bool. if x then false else true
    The boolean "not" function.
  • λx:Bool. true
    The constant function that takes every (boolean) argument to true.

  • λx:Bool. λy:Bool. x
    A two-argument function that takes two booleans and returns the first one. (Note that, as in Coq, a two-argument function is really a one-argument function whose body is also a one-argument function.)
  • x:Bool. λy:Bool. x) false true
    A two-argument function that takes two booleans and returns the first one, applied to the booleans false and true.
    Note that, as in Coq, application associates to the left — i.e., this expression is parsed as ((λx:Bool. λy:Bool. x) false) true.
  • λf:BoolBool. f (f true)
    A higher-order function that takes a function f (from booleans to booleans) as an argument, applies f to true, and applies f again to the result.
  • f:BoolBool. f (f true)) x:Bool. false)
    The same higher-order function, applied to the constantly false function.

Note that all functions are anonymous.
We'll see how to add named function declarations as syntactic sugar in the next chapter.

The types of the STLC include Bool for boolean values and arrow types for functions.
      T ::= Bool
          | T1  T2
For example:
  • λx:Bool. false has type BoolBool
  • λx:Bool. x has type BoolBool
  • x:Bool. x) true has type Bool
  • λx:Bool. λy:Bool. x has type BoolBoolBool (i.e. Bool (BoolBool))
  • x:Bool. λy:Bool. x) false has type BoolBool
  • x:Bool. λy:Bool. x) false true has type Bool

What is the type of the following term?
 λf:BoolBool. f (f true)
(1) Bool (Bool Bool)
(2) (BoolBool) Bool
(3) BoolBool
(4) Bool
(5) none of the above

How about this?
 (λf:BoolBool. f (f true)) (λx:Bool. false)
(1) Bool (Bool Bool)
(2) (BoolBool) Bool
(3) BoolBool
(4) Bool
(5) none of the above

Syntax

Let's begin by formalizing the syntax of the STLC.


Types


Inductive ty : Type :=
  | TBool : ty
  | TArrow : ty ty ty.

Terms


Inductive tm : Type :=
  | tvar : id tm
  | tapp : tm tm tm
  | tabs : id ty tm tm
  | ttrue : tm
  | tfalse : tm
  | tif : tm tm tm tm.

Tactic Notation "t_cases" tactic(first) ident(c) :=
  first;
  [ Case_aux c "tvar" | Case_aux c "tapp"
  | Case_aux c "tabs" | Case_aux c "ttrue"
  | Case_aux c "tfalse" | Case_aux c "tif" ].

Note that an abstraction λx:T.t (formally, tabs x T t) is always annotated with the type T of its parameter, in contrast to Coq (and other functional languages like ML, Haskell, etc.), which use type inference to fill in missing annotations. We're not considering type inference here, to keep things simple.

Some examples...

Definition x := (Id 0).
Definition y := (Id 1).
Definition z := (Id 2).
Hint Unfold x.
Hint Unfold y.
Hint Unfold z.

idB = λx:Bool. x

Notation idB :=
  (tabs x TBool (tvar x)).

idBB = λx:BoolBool. x

Notation idBB :=
  (tabs x (TArrow TBool TBool) (tvar x)).

idBBBB = λx:(BoolBool) (BoolBool). x

Notation idBBBB :=
  (tabs x (TArrow (TArrow TBool TBool)
                      (TArrow TBool TBool))
    (tvar x)).

k = λx:Bool. λy:Bool. x

Notation k := (tabs x TBool (tabs y TBool (tvar x))).

notB = λx:Bool. if x then false else true

Notation notB := (tabs x TBool (tif (tvar x) tfalse ttrue)).

(We write these as Notations rather than Definitions to make things easier for auto.)

Operational Semantics

To define the small-step semantics of STLC terms...
  • We begin by defining the set of values.
  • Next, we define free variables and substitution. (These are used in the reduction rule for application expressions.)
  • Finally, we give the small-step relation itself.

Values

To define the values of the STLC, we have a few cases to consider.
First, for the boolean part of the language, the situation is clear: true and false are the only values. An if expression is never a value.

Second, an application is clearly not a value: It represents a function being invoked on some argument, which clearly still has work left to do.

Third, for abstractions, we have a choice:
  • We can say that λx:T.t1 is a value only when t1 is a value — i.e., only if the function's body has been reduced (as much as it can be without knowing what argument it is going to be applied to).
  • Or we can say that λx:T.t1 is always a value, no matter whether t1 is one or not — in other words, we can say that reduction stops at abstractions.
Coq, in its built-in functional programming langauge, makes the first choice — for example,
         Eval simpl in (fun x:bool ⇒ 3 + 4)
yields fun x:bool 7.
Most real-world functional programming languages make the second choice — reduction of a function's body only begins when the function is actually applied to an argument. We also make the second choice here.

Finally, having made the choice not to reduce under abstractions, we don't need to worry about whether variables are values, since we'll always be reducing programs "from the outside in," and that means the step relation will always be working with closed terms (ones with no free variables).

Inductive value : tm Prop :=
  | v_abs : x T t,
      value (tabs x T t)
  | v_true :
      value ttrue
  | v_false :
      value tfalse.


Free Variables and Substitution

Now we come to the heart of the STLC: the operation of substituting one term for a variable in another term.
This operation will be used below to define the operational semantics of function application, where we will need to substitute the argument term for the function parameter in the function's body. For example, we reduce
       (λx:Bool. if x then true else xfalse
to
       if false then true else false
by substituting false for the parameter x in the body of the function.
In general, we need to be able to substitute some given term s for occurrences of some variable x in another term t. In informal discussions, this is usually written [x:=s]t and pronounced "substitute x with s in t."

Here are some examples:
  • [x:=true] (if x then x else false) yields if true then true else false
  • [x:=true] x yields true
  • [x:=true] (if x then x else y) yields if true then true else y
  • [x:=true] y yields y
  • [x:=true] false yields false (vacuous substitution)
  • [x:=true] y:Bool. if y then x else false) yields λy:Bool. if y then true else false
  • [x:=true] y:Bool. x) yields λy:Bool. true
  • [x:=true] y:Bool. y) yields λy:Bool. y
  • [x:=true] x:Bool. x) yields λx:Bool. x
The last example is very important: substituting x with true in λx:Bool. x does not yield λx:Bool. true! The reason for this is that the x in the body of λx:Bool. x is bound by the abstraction: it is a new, local name that just happens to be spelled the same as some global name x.

Here is the definition, informally...
   [x:=s]x = s
   [x:=s]y = y                                   if x ≠ y
   [x:=s](λx:T11.t12)   = λx:T11. t12      
   [x:=s](λy:T11.t12)   = λy:T11. [x:=s]t12      if x ≠ y
   [x:=s](t1 t2)        = ([x:=s]t1) ([x:=s]t2)       
   [x:=s]true           = true
   [x:=s]false          = false
   [x:=s](if t1 then t2 else t3) = 
                   if [x:=s]t1 then [x:=s]t2 else [x:=s]t3
... and formally:

Reserved Notation "'[' x ':=' s ']' t" (at level 20).

Fixpoint subst (x:id) (s:tm) (t:tm) : tm :=
  match t with
  | tvar x'
      if eq_id_dec x x' then s else t
  | tabs x' T t1
      tabs x' T (if eq_id_dec x x' then t1 else ([x:=s] t1))
  | tapp t1 t2
      tapp ([x:=s] t1) ([x:=s] t2)
  | ttrue
      ttrue
  | tfalse
      tfalse
  | tif t1 t2 t3
      tif ([x:=s] t1) ([x:=s] t2) ([x:=s] t3)
  end

where "'[' x ':=' s ']' t" := (subst x s t).

Technical note: Substitution becomes trickier to define if we consider the case where s, the term being substituted for a variable in some other term, may itself contain free variables. Since we are only interested here in defining the step relation on closed terms (i.e., terms like λx:Bool. x, that do not mention variables are not bound by some enclosing lambda), we can skip this extra complexity here, but it must be dealt with when formalizing richer languages.

What is the result of the following substitution?
       [x:=s](λy:T11.x (λx:T12.x))
(1) y:T11.x x:T12.x))
(2) y:T11.s x:T12.s))
(3) y:T11.s x:T12.x))
(4) none of the above

Reduction

value v2 (ST_AppAbs)  

(λx:T.t12) v2  [x:=v2]t12
t1  t1' (ST_App1)  

t1 t2  t1' t2
value v1
t2  t2' (ST_App2)  

v1 t2  v1 t2'
(plus the usual rules for booleans).
This is call by value reduction: to reduce an application (t1 t2),
  • first reduce t1 to a value (a function)
  • then reduce the argument t2 to a value
  • then reduce the application itself by substituting t2 for the bound variable x in the body t1.
The ST_AppAbs rule is often called beta-reduction.

Reserved Notation "t1 '' t2" (at level 40).

Inductive step : tm tm Prop :=
  | ST_AppAbs : x T t12 v2,
         value v2
         (tapp (tabs x T t12) v2) [x:=v2]t12
  | ST_App1 : t1 t1' t2,
         t1 t1'
         tapp t1 t2 tapp t1' t2
  | ST_App2 : v1 t2 t2',
         value v1
         t2 t2'
         tapp v1 t2 tapp v1 t2'
  | ST_IfTrue : t1 t2,
      (tif ttrue t1 t2) t1
  | ST_IfFalse : t1 t2,
      (tif tfalse t1 t2) t2
  | ST_If : t1 t1' t2 t3,
      t1 t1'
      (tif t1 t2 t3) (tif t1' t2 t3)

where "t1 '' t2" := (step t1 t2).


Examples

Example:
    ((λx:BoolBool. x) (λx:Bool. x)) ⇒* (λx:Bool. x)
i.e.
    (idBB idB⇒* idB

Lemma step_example1 :
  (tapp idBB idB) ⇒* idB.
Proof.
  eapply multi_step.
    apply ST_AppAbs.
    apply v_abs.
  simpl.
  apply multi_refl. Qed.

Example:
((λx:BoolBool. x) ((λx:BoolBool. x) (λx:Bool. x))) 
      ⇒* (λx:Bool. x)
i.e.
  (idBB (idBB idB)) ⇒* idB.

Lemma step_example2 :
  (tapp idBB (tapp idBB idB)) ⇒* idB.
Proof.
  eapply multi_step.
    apply ST_App2. auto.
    apply ST_AppAbs. auto.
  eapply multi_step.
    apply ST_AppAbs. simpl. auto.
  simpl. apply multi_refl. Qed.

Example:
((λx:BoolBool. x) (λx:Bool. if x then false
                              else true)) true)
      ⇒* false
i.e.
  ((idBB notBttrue⇒* tfalse.

Lemma step_example3 :
  tapp (tapp idBB notB) ttrue ⇒* tfalse.
Proof.
  eapply multi_step.
    apply ST_App1. apply ST_AppAbs. auto. simpl.
  eapply multi_step.
    apply ST_AppAbs. auto. simpl.
  eapply multi_step.
    apply ST_IfTrue. apply multi_refl. Qed.

Example:
((λx:BoolBool. x) ((λx:Bool. if x then false
                               else truetrue))
      ⇒* false
i.e.
  (idBB (notB ttrue)) ⇒* tfalse.

Lemma step_example4 :
  tapp idBB (tapp notB ttrue) ⇒* tfalse.
Proof.
  eapply multi_step.
    apply ST_App2. auto.
    apply ST_AppAbs. auto. simpl.
  eapply multi_step.
    apply ST_App2. auto.
    apply ST_IfTrue.
  eapply multi_step.
    apply ST_AppAbs. auto. simpl.
  apply multi_refl. Qed.

Do values and normal forms coincide in the language presented so far?
(1) yes
(2) no

Typing


Contexts

Question: What is the type of the term "x y"?
Answer: It depends on the types of x and y!
I.e., in order to assign a type to a term, we need to know what assumptions we should make about the types of its free variables.
This leads us to a three-place "typing judgment", informally written Γ t : T, where Γ is a "typing context" — a mapping from variables to their types.

We hide the definition of partial maps in a module since it is actually defined in SfLib.

Module PartialMap.

Definition partial_map (A:Type) := id option A.

Definition empty {A:Type} : partial_map A := (fun _None).

Informally, we'll write Γ, x:T for "extend the partial function Γ to also map x to T." Formally, we use the function extend to add a binding to a partial map.

Definition extend {A:Type} (Γ : partial_map A) (x:id) (T : A) :=
  fun x'if eq_id_dec x x' then Some T else Γ x'.

End PartialMap.

Definition context := partial_map ty.

Typing Relation

Γ x = T (T_Var)  

Γ  x ∈ T
Γ , x:T11  t12 ∈ T12 (T_Abs)  

Γ  λx:T11.t12 ∈ T11->T12
Γ  t1 ∈ T11->T12
Γ  t2 ∈ T11 (T_App)  

Γ  t1 t2 ∈ T12
   (T_True)  

Γ  true ∈ Bool
   (T_False)  

Γ  false ∈ Bool
Γ  t1 ∈ Bool    Γ  t2 ∈ T    Γ  t3 ∈ T (T_If)  

Γ  if t1 then t2 else t3 ∈ T
We can read the three-place relation Γ t T as: "to the term t we can assign the type T using as types for the free variables of t the ones specified in the context Γ."


Examples


Example typing_example_1 :
  empty tabs x TBool (tvar x) ∈ TArrow TBool TBool.
Proof.
  apply T_Abs. apply T_Var. reflexivity. Qed.

Note that since we added the has_type constructors to the hints database, auto can actually solve this one immediately.

Example typing_example_1' :
  empty tabs x TBool (tvar x) ∈ TArrow TBool TBool.
Proof. auto. Qed.

Another example:
     empty  λx:A. λy:AA. y (y x)) 
           ∈ A  (AA A.

Example typing_example_2 :
  empty
    (tabs x TBool
       (tabs y (TArrow TBool TBool)
          (tapp (tvar y) (tapp (tvar y) (tvar x))))) ∈
    (TArrow TBool (TArrow (TArrow TBool TBool) TBool)).
Proof with auto using extend_eq.
  apply T_Abs.
  apply T_Abs.
  eapply T_App. apply T_Var...
  eapply T_App. apply T_Var...
  apply T_Var...
Qed.

   empty  λx:BoolB. λy:BoolBool. λz:Bool.
               y (x z
         ∈ T.

We can also show that terms are not typable. For example, let's formally check that there is no typing derivation assigning a type to the term λx:Bool. λy:Bool, x y — i.e.,
    ¬ T,
        empty  λx:Bool. λy:Boolx y : T.

Example typing_nonexample_1 :
  ¬ T,
      empty
        (tabs x TBool
            (tabs y TBool
               (tapp (tvar x) (tvar y)))) ∈
        T.
Proof.
  intros Hc. inversion Hc.
  (* The clear tactic is useful here for tidying away bits of
     the context that we're not going to need again. *)

  inversion H. subst. clear H.
  inversion H5. subst. clear H5.
  inversion H4. subst. clear H4.
  inversion H2. subst. clear H2.
  inversion H5. subst. clear H5.
  (* rewrite extend_neq in H1. rewrite extend_eq in H1. *)
  inversion H1. Qed.

Another nonexample:
    ¬ (ST,
          empty  λx:S. x x : T).

Which of the following propositions is not provable?
(1) y:Bool λx:Bool.x BoolBool
(2) T, empty λy:BoolBool. λx:Bool. y x T
(3) T, empty λy:BoolBool. λx:Bool. x y T
(4) S, x:S λy:BoolBool. y x S

How about the following?
(1) T, empty λy:BoolBoolBool. λx:Bool. y x T
(2) S, T, x:S x x x T
(3) S, U, T, x:S, y:U λz:Bool. x (y z) T
(4) S, T, x:S λy:Bool. x (x y) T

End STLC.