MoreIndMore on Induction

Require Export ProofObjects.

Induction Principles

This is a good point to pause and take a deeper look at induction principles.

Every time we declare a new Inductive datatype, Coq automatically generates and proves an induction principle for this type.

The induction principle for a type t is called t_ind. Here is the one for natural numbers:

Check nat_ind.
(*  ===> nat_ind :
           forall P : nat -> Prop,
              P 0  ->
              (forall n : nat, P n -> P (S n))  ->
              forall n : nat, P n  *)

The induction tactic is a straightforward wrapper that, at its core, simply performs apply t_ind. To see this more clearly, let's experiment a little with using apply nat_ind directly, instead of the induction tactic, to carry out some proofs. Here, for example, is an alternate proof of a theorem that we saw in the Basics chapter.

Theorem mult_0_r' : ∀n:nat,
  n × 0 = 0.
Proof.
  apply nat_ind.
  Case "O". reflexivity.
  Case "S". simpl. intros n IHn. rewrite → IHn.
    reflexivity. Qed.

Why the induction tactic is nicer to use in practice:

In the second step of the proof we have to do a little manual bookkeeping (the intros).
We do not introduce n into the context before applying nat_ind.
The apply tactic automatically chooses variable names for us (in the second subgoal, here), whereas induction gives us a way to specify what names should be used. The automatic choice is not great.

Coq generates induction principles for every datatype defined with Inductive, including those that aren't recursive. (Although we don't need induction to prove properties of non-recursive datatypes, the idea of an induction principle still makes sense for them: it gives a way to prove that a property holds for all values of the type.)

These generated principles follow a similar pattern. If we define a type t with constructors c1 ... cn, Coq generates a theorem with this shape:

    t_ind :
       ∀P : t → Prop,
            ... case for c1 ... →
            ... case for c2 ... →
            ...
            ... case for cn ... →
            ∀n : t, P n

The specific shape of each case depends on the arguments to the corresponding constructor. Before trying to write down a general rule, let's look at some more examples. First, an example where the constructors take no arguments:

Inductive yesno : Type :=
  | yes : yesno
  | no : yesno.

Check yesno_ind.
(* ===> yesno_ind : forall P : yesno -> Prop,
                      P yes  ->
                      P no  ->
                      forall y : yesno, P y *)

Here's another example, this time with one of the constructors taking some arguments.

Inductive natlist : Type :=
  | nnil : natlist
  | ncons : nat → natlist → natlist.

Check natlist_ind.
(* ===> (modulo a little variable renaming for clarity)
   natlist_ind :
      forall P : natlist -> Prop,
         P nnil  ->
         (forall (n : nat) (l : natlist), P l -> P (ncons n l)) ->
         forall n : natlist, P n *)

From these examples, we can extract this general rule:

The type declaration gives several constructors; each corresponds to one clause of the induction principle.
Each constructor c takes argument types a1...an.
Each ai can be either t (the datatype we are defining) or some other type s.
The corresponding case of the induction principle says (in English):
- "for all values x1...xn of types a1...an, if P holds for each of the inductive arguments (each xi of type t), then P holds for c x1 ... xn".

More on the induction Tactic

The induction tactic actually does even more low-level bookkeeping for us than we discussed above.

Recall the informal statement of the induction principle for natural numbers:

If P n is some proposition involving a natural number n, and we want to show that P holds for all numbers n, we can reason like this:
- show that P O holds
- show that, if P n' holds, then so does P (S n')
- conclude that P n holds for all n.

So, when we begin a proof with intros n and then induction n, we are first telling Coq to consider a particular n (by introducing it into the context) and then telling it to prove something about all numbers (by using induction).

What Coq actually does in this situation, internally, is to "re-generalize" the variable we perform induction on. For example, in our original proof that plus is associative...

Theorem plus_assoc' : ∀n m p : nat,
  n + (m + p) = (n + m) + p.
Proof.
  (* ...we first introduce all 3 variables into the context,
     which amounts to saying "Consider an arbitrary n, m, and
     p..." *)
  intros n m p.
  (* ...We now use the induction tactic to prove P n (that
     is, n + (m + p) = (n + m) + p) for _all_ n,
     and hence also for the particular n that is in the context
     at the moment. *)
  induction n as [| n'].
  Case "n = O". reflexivity.
  Case "n = S n'".
    (* In the second subgoal generated by induction -- the
       "inductive step" -- we must prove that P n' implies
       P (S n') for all n'.  The induction tactic
       automatically introduces n' and P n' into the context
       for us, leaving just P (S n') as the goal. *)
    simpl. rewrite → IHn'. reflexivity. Qed.

Induction Principles in Prop

Coq also automatically produces induction priniciples for inductively defined predicates. These differ just a little bit from the induction principles for data types: they are slightly less general than you might expect, but consequently easier to use:

Check gorgeous_ind.
(* ===>  gorgeous_ind
     : forall P : nat -> Prop,
       P 0 ->
       (forall n : nat, gorgeous n -> P n -> P (3 + n)) ->
       (forall n : nat, gorgeous n -> P n -> P (5 + n)) ->
       forall n : nat, gorgeous n -> P n *)

In English, gorgeous_ind says:

Suppose, P is a property of natural numbers (that is, P n is a Prop for every n). To show that P n holds whenever n is gorgeous, it suffices to show:
- P holds for 0,
- for any n, if n is gorgeous and P holds for n, then P holds for 3+n,
- for any n, if n is gorgeous and P holds for n, then P holds for 5+n.

As expected, we can apply gorgeous_ind directly instead of using induction.

Theorem gorgeous__beautiful' : ∀n, gorgeous n → beautiful n.
Proof.
   intros.
   apply gorgeous_ind.
   Case "g_0".
       apply b_0.
   Case "g_plus3".
       intros.
       apply b_sum. apply b_3.
       apply H1.
   Case "g_plus5".
       intros.
       apply b_sum. apply b_5.
       apply H1.
   apply H.
Qed.

The precise form of an Inductive definition can affect the induction principle Coq generates.

For example, in Logic, we have defined ≤ as:

(* Inductive le : nat -> nat -> Prop :=
| le_n : forall n, le n n
| le_S : forall n m, (le n m) -> (le n (S m)). *)

Note that the left-hand argument n to le is the same everywhere in the definition, so we can actually make it a "general parameter" to the whole definition, rather than an argument to each constructor.

Inductive le (n:nat) : nat → Prop :=
| le_n : le n n
| le_S : ∀m, (le n m) → (le n (S m)).

Notation "m ≤ n" := (le m n).

The second one is better, even though it looks less symmetric. Why? Because it gives us a simpler induction principle.

Check le_ind.
(* ===>  forall (n : nat) (P : nat -> Prop),
           P n ->
           (forall m : nat, n <= m -> P m -> P (S m)) ->
           forall n0 : nat, n <= n0 -> P n0 *)

Induction Principles for other Logical Propositions

Similarly, in Logic we have defined eq as:

(* Inductive eq (X:Type) : X -> X -> Prop :=
refl_equal : forall x, eq X x x. *)

In the Coq standard library, the definition of equality is slightly different:

Inductive eq' (X:Type) (x:X) : X → Prop :=
refl_equal' : eq' X x x.

The induction principle for eq' says, in effect, that if two things are equal and some property holds of one of them, then it holds of the other too ("Leibniz equality:").

Check eq'_ind.
(* ===>
     forall (X : Type) (x : X) (P : X -> Prop),
       P x -> forall y : X, x =' y -> P y

   ===>  (i.e., after a little reorganization)
     forall (X : Type) (x : X) forall y : X,
       x =' y ->
       forall P : X -> Prop, P x -> P y *)

Explicit Proof Objects for Induction

Recall the induction principle on naturals that Coq generates for us automatically from the Inductive declation for nat.

Check nat_ind.
(* ===>
   nat_ind : forall P : nat -> Prop,
      P 0 ->
      (forall n : nat, P n -> P (S n)) ->
      forall n : nat, P n  *)

There's nothing magic about this induction lemma: it's just another Coq lemma that requires a proof. Coq generates the proof automatically too...

Print nat_ind.
Print nat_rect.
(* ===> (after some manual inlining and tidying)
   nat_ind =
    fun (P : nat -> Prop)
        (f : P 0)
        (f0 : forall n : nat, P n -> P (S n)) =>
          fix F (n : nat) : P n :=
             match n with
            | 0 => f
            | S n0 => f0 n0 (F n0)
            end.
*)

We can read this as follows: Suppose we have evidence f that P holds on 0, and evidence f0 that ∀ n:nat, P n → P (S n). Then we can prove that P holds of an arbitrary nat n via a recursive function F (here defined using the expression form Fix rather than by a top-level Fixpoint declaration). F pattern matches on n:

If it finds 0, F uses f to show that P n holds.
If it finds S n0, F applies itself recursively on n0 to obtain evidence that P n0 holds; then it applies f0 on that evidence to show that P (S n) holds.

F is just an ordinary recursive function that happens to operate on evidence in Prop rather than on terms in Set.

We can adapt this approach to proving nat_ind to help prove non-standard induction principles too. Recall our desire to prove that

∀ n : nat, even n → ev n.

Attempts to do this by standard induction on n fail, because the induction principle only lets us proceed when we can prove that even n → even (S n) — which is of course never provable. What we did in Logic was a bit of a hack:

Theorem even__ev : ∀ n : nat, (even n → ev n) ∧ (even (S n) → ev (S n)).

We can make a much better proof by defining and proving a non-standard induction principle that goes "by twos":

Definition nat_ind2 :
    ∀(P : nat → Prop),
    P 0 →
    P 1 →
    (∀n : nat, P n → P (S(S n))) →
    ∀n : nat , P n :=
       fun P ⇒ fun P0 ⇒ fun P1 ⇒ fun PSS ⇒
          fix f (n:nat) := match n with
                             0 ⇒ P0
                           | 1 ⇒ P1
                           | S (S n') ⇒ PSS n' (f n')
                          end.

Once you get the hang of it, it is entirely straightforward to give an explicit proof term for induction principles like this. Proving this as a lemma using tactics is much less intuitive (try it!).

The induction ... using tactic variant gives a convenient way to specify a non-standard induction principle like this.

Lemma even__ev' : ∀n, even n → ev n.
Proof.
intros.
induction n as [ | |n'] using nat_ind2.
  Case "even 0".
    apply ev_0.
  Case "even 1".
    inversion H.
  Case "even (S(S n'))".
    apply ev_SS.
    apply IHn'. unfold even. unfold even in H. simpl in H. apply H.
Qed.

The Coq Trusted Computing Base

One issue that arises with any automated proof assistant is "why trust it?": what if there is a bug in the implementation that renders all its reasoning suspect?

While it is impossible to allay such concerns completely, the fact that Coq is based on the Curry-Howard correspondence gives it a strong foundation. Because propositions are just types and proofs are just terms, checking that an alleged proof of a proposition is valid just amounts to type-checking the term. Type checkers are relatively small and straightforward programs, so the "trusted computing base" for Coq — the part of the code that we have to believe is operating correctly — is small too.

What must a typechecker do? Its primary job is to make sure that in each function application the expected and actual argument types match, that the arms of a match expression are constructor patterns belonging to the inductive type being matched over and all arms of the match return the same type, and so on.

There are a few additional wrinkles:

Since Coq types can themselves be expressions, the checker must normalize these (by using the computation rules) before comparing them.
The checker must make sure that match expressions are exhaustive. That is, there must be an arm for every possible constructor. To see why, consider the following alleged proof object:

Definition or_bogus : ∀P Q, P ∨ Q → P :=
  fun (P Q : Prop) (A : P ∨ Q) ⇒
     match A with
     | or_introl H ⇒ H
     end.

All the types here match correctly, but the match only considers one of the possible constructors for or. Coq's exhaustiveness check will reject this definition.
The checker must make sure that each fix expression terminates. It does this using a syntactic check to make sure that each recursive call is on a subexpression of the original argument. To see why this is essential, consider this alleged proof:

Definition nat_false : ∀(n:nat), False :=
fix f (n:nat) : False := f n.

Again, this is perfectly well-typed, but (fortunately) Coq will reject it.

Note that the soundness of Coq depends only on the correctness of this typechecking engine, not on the tactic machinery. If there is a bug in a tactic implementation (and this certainly does happen!), that tactic might construct an invalid proof term. But when you type Qed, Coq checks the term for validity from scratch. Only lemmas whose proofs pass the type-checker can be used in further proof developments.