Playing with Coq

Coq is interactive theorem prover, that I have been playing around a bit recently. This blog post shows how very basic function can be defined in Coq and then moved over to Haskell.

Word of warning, I know very little of Coq as of writing this post. So I might be doing some silly things. But in any case, I’m having fun with this and wanted to share it with you.

I wanted to write a piece of code that computes blood type based on two gene alleles (just like you might remember from elementary school biology class). Like always, I start with some types:

Inductive BloodTypeAllele : Type :=
  | BloodTypeA
  | BloodTypeB
  | BloodTypeO.

Inductive BloodType : Type :=
  | TypeA
  | TypeB
  | TypeAB
  | TypeO.

This defines two types BloodTypeAllele and BloodType. First one is used to represent gene that affects to blood type and second one is the actual blood type.

Mapping between them is defined as follows:

Definition bloodType (a b : BloodTypeAllele) : BloodType :=
  match a, b with
  | BloodTypeA, BloodTypeA => TypeA
  | BloodTypeA, BloodTypeO => TypeA
  | BloodTypeA, BloodTypeB => TypeAB
  | BloodTypeB, BloodTypeB => TypeB
  | BloodTypeB, BloodTypeA => TypeAB
  | BloodTypeB, BloodTypeO => TypeB
  | BloodTypeO, BloodTypeA => TypeA
  | BloodTypeO, BloodTypeB => TypeB
  | BloodTypeO, BloodTypeO => TypeO
  end.

There’s two alleles, one from each strand of DNA. Together they decide blood type. Since BloodTypeO allele is recessive, person needs to have two of them for them to have TypeO blood type.

Since Coq is interactive theorem prover, we can prove some properties of this code. For example, following states a theorem called double_O_results_O_type that states that function bloodType applied to BloodTypeO and BloodType has value of TypeO. It also proves it:

Theorem double_O_results_O_type :
  bloodType BloodTypeO BloodTypeO = TypeO.
Proof.
  reflexivity.
  Qed.

More interesting theorem and proof is following:

Theorem not_double_O_does_not_result_O_type :
  forall (b1 b2 : BloodTypeAllele),
  b1 <> BloodTypeO \/ b2 <> BloodTypeO ->
  bloodType b1 b2 <> TypeO.
Proof.
  intros.
  destruct b1.
  - destruct b2.
    + discriminate.
    + discriminate.
    + discriminate.
  - destruct b2.
    + discriminate.
    + discriminate.
    + discriminate.
  - destruct b2.
    + discriminate.
    + discriminate.
    + destruct H.
      * simpl. contradiction.
      * simpl. contradiction.
Qed.

Theorem not_double_O_does_not_result_O_type states that for all BloodTypeAllele combinations, when b1 is not equal to BloodTypeO or b2 is not equal to BloodTypeO, it implies that bloodType b1 b2 is not equal to TypeO. In other words, you can’t get TypeO blood type if you don’t have two BloodTypeO alleles.

Mathematician could write this as: ∀ b1 b2, b1 ≠ BloodTypeO ∨ b2 ≠ BloodTypeO → bloodType b1 b2 ≠ TypeO.

So, this if all fine and nice. I can write a function and prove some properties about it mathematically. This means that I don’t have to craft test cases by hand or use tool like QuickCheck to test it. Results of course depend a lot on quality of theorems I state. Afterall, if initial specification is incorrect, no tool can fix it.

Final neat trick is extracting these definitions to Haskell. Assuming that the code above is in module called Genes, we can add following to the source file:

Extraction Language Haskell.
Extraction Genes.

This instructs compiler to extract Haskell code for us, which results following:

data BloodTypeAllele =
   BloodTypeA
 | BloodTypeB
 | BloodTypeO

data BloodType =
   TypeA
 | TypeB
 | TypeAB
 | TypeO

bloodType :: BloodTypeAllele -> BloodTypeAllele -> BloodType
bloodType a b =
  case a of {
   BloodTypeA -> case b of {
                  BloodTypeB -> TypeAB;
                  _ -> TypeA};
   BloodTypeB -> case b of {
                  BloodTypeA -> TypeAB;
                  _ -> TypeB};
   BloodTypeO ->
    case b of {
     BloodTypeA -> TypeA;
     BloodTypeB -> TypeB;
     BloodTypeO -> TypeO}}

Not quite what I would have written by hand (especially the bloodType function). Still, type signatures are what I want them to be and bloodType has two properties proven about it mathematically. I’ll continue learning about Coq and exploring what I can do with it. There’s couple good resources I have been using. One is Software Foundations and another (that I haven’t had time to read, but that has been recommended me) is Coq in a Hurry.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s