MoreStlcMore on the Simply Typed Lambda-Calculus
Simple Extensions to STLC
The simply typed lambda-calculus has enough structure to make its
theoretical properties interesting, but it is not much of a
programming language. In this chapter, we begin to close the gap
with real-world languages by introducing a number of familiar
features that have straightforward treatments at the level of
typing.
Numbers
Adding types, constants, and primitive operations for numbers is
easy — just a matter of combining the
Types and
Stlc
chapters.
let-bindings
When writing a complex expression, it is often useful to
give names to some of its subexpressions: this avoids repetition
and often increases readability.
Syntax:
t ::= Terms
| ... (other terms same as before)
| let x=t in t let-binding
Reduction:
t1 ⇒ t1' |
(ST_Let1)
|
|
let x=t1 in t2 ⇒ let x=t1' in t2 |
|
|
(ST_LetValue)
|
|
let x=v1 in t2 ⇒ [x:=v1]t2 |
|
Typing:
Γ ⊢ t1 : T1 Γ , x:T1 ⊢ t2 : T2 |
(T_Let)
|
|
Γ ⊢ let x=t1 in t2 : T2 |
|
Pairs
In Coq, the primitive way of extracting the components of a pair
is
pattern matching. An alternative style is to take
fst and
snd — the first- and second-projection operators — as
primitives. Just for fun, let's do our products this way. For
example, here's how we'd write a function that takes a pair of
numbers and returns the pair of their sum and difference:
λx:Nat*Nat.
let sum = x.fst + x.snd in
let diff = x.fst - x.snd in
(sum,diff)
Syntax:
t ::= Terms
| ...
| (t,t) pair
| t.fst first projection
| t.snd second projection
v ::= Values
| ...
| (v,v) pair value
T ::= Types
| ...
| T * T product type
For evaluation, we need several new rules specifying how pairs and
projection behave.
t1 ⇒ t1' |
(ST_Pair1)
|
|
(t1,t2) ⇒ (t1',t2) |
|
t2 ⇒ t2' |
(ST_Pair2)
|
|
(v1,t2) ⇒ (v1,t2') |
|
t1 ⇒ t1' |
(ST_Fst1)
|
|
t1.fst ⇒ t1'.fst |
|
|
(ST_FstPair)
|
|
(v1,v2).fst ⇒ v1 |
|
t1 ⇒ t1' |
(ST_Snd1)
|
|
t1.snd ⇒ t1'.snd |
|
|
(ST_SndPair)
|
|
(v1,v2).snd ⇒ v2 |
|
The typing rules for pairs and projections are straightforward.
Γ ⊢ t1 : T1 Γ ⊢ t2 : T2 |
(T_Pair)
|
|
Γ ⊢ (t1,t2) : T1*T2 |
|
Γ ⊢ t1 : T11*T12 |
(T_Fst)
|
|
Γ ⊢ t1.fst : T11 |
|
Γ ⊢ t1 : T11*T12 |
(T_Snd)
|
|
Γ ⊢ t1.snd : T12 |
|
Unit
Another handy base type, found especially in languages in
the ML family, is the singleton type
Unit.
Syntax:
t ::= Terms
| ...
| unit unit value
v ::= Values
| ...
| unit unit
T ::= Types
| ...
| Unit Unit type
Typing:
Is there any term of type
Unit besides
unit?
(1) Yes
(2) No
Sums
Many programs need to deal with values that can take two distinct
forms. For example, we might identify employees in an accounting
application using using
either their name
or their id number.
A search function might return
either a matching value
or an
error code.
These are specific examples of a binary
sum type,
which describes a set of values drawn from exactly two given types, e.g.
Nat + Bool
We create elements of these types by tagging elements of the
component types, telling on which side we are putting them. E.g.,
inl 42 : Nat + Bool
inr true : Nat + Bool
In general, the elements of a type
T1 + T2 consist of the
elements of
T1 tagged with the token
inl, plus the elements of
T2 tagged with
inr.
One important usage of sums is signaling errors:
div : Nat -> Nat -> (Nat + Unit) =
div =
λx:Nat. λy:Nat.
if iszero y then
inr unit
else
inl ...
The type
Nat + Unit above is in fact isomorphic to
option nat
in Coq, and we've already seen how to signal errors with options.
Here is a simple example showing how to do case
analysis in order to use values of sum type:
getNat =
λx:Nat+Bool.
case x of
inl n => n
| inr b => if b then 1 else 0
Syntax:
t ::= Terms
| ...
| inl T t tagging (left)
| inr T t tagging (right)
| case t of case
inl x => t
| inr x => t
v ::= Values
| ...
| inl T v tagged value (left)
| inr T v tagged value (right)
T ::= Types
| ...
| T + T sum type
Evaluation:
t1 ⇒ t1' |
(ST_Inl)
|
|
inl T t1 ⇒ inl T t1' |
|
t1 ⇒ t1' |
(ST_Inr)
|
|
inr T t1 ⇒ inr T t1' |
|
t0 ⇒ t0' |
(ST_Case)
|
|
case t0 of inl x1 ⇒ t1 | inr x2 ⇒ t2 ⇒ |
|
case t0' of inl x1 ⇒ t1 | inr x2 ⇒ t2 |
|
|
(ST_CaseInl)
|
|
case (inl T v0) of inl x1 ⇒ t1 | inr x2 ⇒ t2 |
|
⇒ [x1:=v0]t1 |
|
|
(ST_CaseInr)
|
|
case (inr T v0) of inl x1 ⇒ t1 | inr x2 ⇒ t2 |
|
⇒ [x2:=v0]t2 |
|
Typing:
Γ ⊢ t1 : T1 |
(T_Inl)
|
|
Γ ⊢ inl T2 t1 : T1 + T2 |
|
Γ ⊢ t1 : T2 |
(T_Inr)
|
|
Γ ⊢ inr T1 t1 : T1 + T2 |
|
Γ ⊢ t0 : T1+T2 |
|
Γ , x1:T1 ⊢ t1 : T |
|
Γ , x2:T2 ⊢ t2 : T |
(T_Case)
|
|
Γ ⊢ case t0 of inl x1 ⇒ t1 | inr x2 ⇒ t2 : T |
|
We use the type annotation in
inl and
inr to make the typing
simpler, similarly to what we did for functions.
What does the following term step to?
let f = λ
x :
Nat +
Bool.
case x of inl n ⇒
n + 3 |
inr b ⇒ 0
in
f (
inl 4)
(1)
(λ
x :
Nat +
Bool.
case x of inl n ⇒
n + 3 |
inr b ⇒ 0) (
inl 4)
(2)
(3)
case inl 4
of inl n ⇒
n + 3 |
inr b ⇒ 0
(4)
What about this one?
(λ
x :
Nat +
Bool.
case x of inl n ⇒
n + 3 |
inr b ⇒ 0) (
inl 4)
(1)
(2)
case inl 4
of inl n ⇒
n + 3 |
inr b ⇒ 0
(3)
What about this one?
case inl 4
of inl n ⇒
n + 3 |
inr b ⇒ 0
(1)
4 + 3
(2)
7
(3)
0
Lists
Syntax:
t ::= Terms
| ...
| nil T
| cons t t
| lcase t of nil -> t | x::x -> t
v ::= Values
| ...
| nil T nil value
| cons v v cons value
T ::= Types
| ...
| List T list of Ts
Reduction:
t1 ⇒ t1' |
(ST_Cons1)
|
|
cons t1 t2 ⇒ cons t1' t2 |
|
t2 ⇒ t2' |
(ST_Cons2)
|
|
cons v1 t2 ⇒ cons v1 t2' |
|
t1 ⇒ t1' |
(ST_Lcase1)
|
|
(lcase t1 of nil → t2 | xh::xt → t3) ⇒ |
|
(lcase t1' of nil → t2 | xh::xt → t3) |
|
|
(ST_LcaseNil)
|
|
(lcase nil T of nil → t2 | xh::xt → t3) |
|
⇒ t2 |
|
|
(ST_LcaseCons)
|
|
(lcase (cons vh vt) of nil → t2 | xh::xt → t3) |
|
⇒ [xh:=vh,xt:=vt]t3 |
|
Typing:
|
(T_Nil)
|
|
Γ ⊢ nil T : List T |
|
Γ ⊢ t1 : T Γ ⊢ t2 : List T |
(T_Cons)
|
|
Γ ⊢ cons t1 t2: List T |
|
Γ ⊢ t1 : List T1 |
|
Γ ⊢ t2 : T |
|
Γ , h:T1, t:List T1 ⊢ t3 : T |
(T_Lcase)
|
|
Γ ⊢ (lcase t1 of nil → t2 | h::t → t3) : T |
|
General Recursion
Another facility found in most programming languages (including
Coq) is the ability to define recursive functions. For example,
we might like to be able to define the factorial function like
this:
fact = λx:Nat.
if x=0 then 1 else x * (fact (pred x)))
But this would require quite a bit of work to formalize: we'd have
to introduce a notion of "function definitions" and carry around an
"environment" of such definitions in the definition of the
step
relation.
Here is another way that is straightforward to formalize: instead
of writing recursive definitions where the right-hand side can
contain the identifier being defined, we can define a
fixed-point
operator that performs the "unfolding" of the recursive definition
in the right-hand side lazily during reduction.
fact =
fix
(λf:Nat->Nat.
λx:Nat.
if x=0 then 1 else x * (f (pred x)))
Syntax:
t ::= Terms
| ...
| fix t fixed-point operator
Reduction:
t1 ⇒ t1' |
(ST_Fix1)
|
|
fix t1 ⇒ fix t1' |
|
F = λxf:T1.t2 |
(ST_FixAbs)
|
|
fix F ⇒ [xf:=fix F]t2 |
|
Typing:
Γ ⊢ t1 : T1->T1 |
(T_Fix)
|
|
Γ ⊢ fix t1 : T1 |
|
Let's see how
ST_FixAbs works by reducing
fact 3 = fix F 3,
where
F = (λf. λx. if x=0 then 1 else x × (f (pred x))) (we are
omitting type annotations for brevity here).
fix F 3
⇒ ST_FixAbs
(λx. if x=0 then 1 else x * (fix F (pred x))) 3
⇒ ST_AppAbs
if 3=0 then 1 else 3 * (fix F (pred 3))
⇒ ST_If0_Nonzero
3 * (fix F (pred 3))
⇒ ST_FixAbs + ST_Mult2
3 * ((λx. if x=0 then 1 else x * (fix F (pred x))) (pred 3))
⇒ ST_PredNat + ST_Mult2 + ST_App2
3 * ((λx. if x=0 then 1 else x * (fix F (pred x))) 2)
⇒ ST_AppAbs + ST_Mult2
3 * (if 2=0 then 1 else 2 * (fix F (pred 2)))
⇒ ST_If0_Nonzero + ST_Mult2
3 * (2 * (fix F (pred 2)))
⇒ ST_FixAbs + 2 x ST_Mult2
3 * (2 * ((λx. if x=0 then 1 else x * (fix F (pred x))) (pred 2)))
⇒ ST_PredNat + 2 x ST_Mult2 + ST_App2
3 * (2 * ((λx. if x=0 then 1 else x * (fix F (pred x))) 1))
⇒ ST_AppAbs + 2 x ST_Mult2
3 * (2 * (if 1=0 then 1 else 1 * (fix F (pred 1))))
⇒ ST_If0_Nonzero + 2 x ST_Mult2
3 * (2 * (1 * (fix F (pred 1))))
⇒ ST_FixAbs + 3 x ST_Mult2
3 * (2 * (1 * ((λx. if x=0 then 1 else x * (fix F (pred x))) (pred 1))))
⇒ ST_PredNat + 3 x ST_Mult2 + ST_App2
3 * (2 * (1 * ((λx. if x=0 then 1 else x * (fix F (pred x))) 0)))
⇒ ST_AppAbs + 3 x ST_Mult2
3 * (2 * (1 * (if 0=0 then 1 else 0 * (fix F (pred 0)))))
⇒ ST_If0Zero + 3 x ST_Mult2
3 * (2 * (1 * 1))
⇒ ST_MultNats + 2 x ST_Mult2
3 * (2 * 1)
⇒ ST_MultNats + ST_Mult2
3 * 2
⇒ ST_MultNats
6
Records
As a final example of a basic extension of the STLC, let's
look briefly at how to define
records and their types.
Intuitively, records can be obtained from pairs by two kinds of
generalization: they are n-ary products (rather than just binary)
and their fields are accessed by
label (rather than position).
This extension is conceptually a straightforward generalization of
pairs and product types, but notationally it becomes a little
heavier; for this reason, we postpone its formal treatment to a
separate chapter (
Records). Therefore records are not included
in the extended exercise below, but they are used to motivate the
Sub chapter.
Syntax:
t ::= Terms
| ...
| {i1=t1, ..., in=tn} record
| t.i projection
v ::= Values
| ...
| {i1=v1, ..., in=vn} record value
T ::= Types
| ...
| {i1:T1, ..., in:Tn} record type
Intuitively, the generalization is pretty obvious. But it's worth
noticing that what we've actually written is rather informal: in
particular, we've written "
..." in several places to mean "any
number of these," and we've omitted explicit mention of the usual
side-condition that the labels of a record should not contain
repetitions. It is possible to devise informal notations that are
more precise, but these tend to be quite heavy and to obscure the
main points of the definitions. So we'll leave these a bit loose
here (they are informal anyway, after all) and do the work of
tightening things up elsewhere (in chapter
Records).
Reduction:
ti ⇒ ti' |
(ST_Rcd)
|
|
{i1=v1, ..., im=vm, in=ti, ...} |
|
⇒ {i1=v1, ..., im=vm, in=ti', ...} |
|
t1 ⇒ t1' |
(ST_Proj1)
|
|
t1.i ⇒ t1'.i |
|
|
(ST_ProjRcd)
|
|
{..., i=vi, ...}.i ⇒ vi |
|
Again, these rules are a bit informal. For example, the first rule
is intended to be read "if
ti is the leftmost field that is not a
value and if
ti steps to
ti', then the whole record steps..."
In the last rule, the intention is that there should only be one
field called i, and that all the other fields must contain values.
Typing:
Γ ⊢ t1 : T1 ... Γ ⊢ tn : Tn |
(T_Rcd)
|
|
Γ ⊢ {i1=t1, ..., in=tn} : {i1:T1, ..., in:Tn} |
|
Γ ⊢ t : {..., i:Ti, ...} |
(T_Proj)
|
|
Γ ⊢ t.i : Ti |
|