Coq is interactive theorem prover, that I have been playing around a bit recently. This blog post shows how very basic function can be defined in Coq and then moved over to Haskell.

Word of warning, I know very little of Coq as of writing this post. So I might be doing some silly things. But in any case, I’m having fun with this and wanted to share it with you.

I wanted to write a piece of code that computes blood type based on two gene alleles (just like you might remember from elementary school biology class). Like always, I start with some types:

Inductive BloodTypeAllele : Type := | BloodTypeA | BloodTypeB | BloodTypeO. Inductive BloodType : Type := | TypeA | TypeB | TypeAB | TypeO.

This defines two types *BloodTypeAllele* and *BloodType*. First one is used to represent gene that affects to blood type and second one is the actual blood type.

Mapping between them is defined as follows:

Definition bloodType (a b : BloodTypeAllele) : BloodType := match a, b with | BloodTypeA, BloodTypeA => TypeA | BloodTypeA, BloodTypeO => TypeA | BloodTypeA, BloodTypeB => TypeAB | BloodTypeB, BloodTypeB => TypeB | BloodTypeB, BloodTypeA => TypeAB | BloodTypeB, BloodTypeO => TypeB | BloodTypeO, BloodTypeA => TypeA | BloodTypeO, BloodTypeB => TypeB | BloodTypeO, BloodTypeO => TypeO end.

There’s two alleles, one from each strand of DNA. Together they decide blood type. Since *BloodTypeO* allele is recessive, person needs to have two of them for them to have *TypeO* blood type.

Since Coq is interactive theorem prover, we can prove some properties of this code. For example, following states a theorem called *double_O_results_O_type* that states that function *bloodType* applied to *BloodTypeO* and *BloodType* has value of *TypeO*. It also proves it:

Theorem double_O_results_O_type : bloodType BloodTypeO BloodTypeO = TypeO. Proof. reflexivity. Qed.

More interesting theorem and proof is following:

Theorem not_double_O_does_not_result_O_type : forall (b1 b2 : BloodTypeAllele), b1 <> BloodTypeO \/ b2 <> BloodTypeO -> bloodType b1 b2 <> TypeO. Proof. intros. destruct b1. - destruct b2. + discriminate. + discriminate. + discriminate. - destruct b2. + discriminate. + discriminate. + discriminate. - destruct b2. + discriminate. + discriminate. + destruct H. * simpl. contradiction. * simpl. contradiction. Qed.

Theorem *not_double_O_does_not_result_O_type* states that for all *BloodTypeAllele* combinations, when *b1* is not equal to *BloodTypeO* or *b2* is not equal to *BloodTypeO*, it implies that *bloodType b1 b2* is not equal to *TypeO*. In other words, you can’t get *TypeO* blood type if you don’t have two *BloodTypeO* alleles.

Mathematician could write this as: ∀ b1 b2, b1 ≠ BloodTypeO ∨ b2 ≠ BloodTypeO → bloodType b1 b2 ≠ TypeO.

So, this if all fine and nice. I can write a function and prove some properties about it mathematically. This means that I don’t have to craft test cases by hand or use tool like QuickCheck to test it. Results of course depend a lot on quality of theorems I state. Afterall, if initial specification is incorrect, no tool can fix it.

Final neat trick is extracting these definitions to Haskell. Assuming that the code above is in module called *Genes*, we can add following to the source file:

Extraction Language Haskell. Extraction Genes.

This instructs compiler to extract Haskell code for us, which results following:

data BloodTypeAllele = BloodTypeA | BloodTypeB | BloodTypeO data BloodType = TypeA | TypeB | TypeAB | TypeO bloodType :: BloodTypeAllele -> BloodTypeAllele -> BloodType bloodType a b = case a of { BloodTypeA -> case b of { BloodTypeB -> TypeAB; _ -> TypeA}; BloodTypeB -> case b of { BloodTypeA -> TypeAB; _ -> TypeB}; BloodTypeO -> case b of { BloodTypeA -> TypeA; BloodTypeB -> TypeB; BloodTypeO -> TypeO}}

Not quite what I would have written by hand (especially the *bloodType* function). Still, type signatures are what I want them to be and *bloodType* has two properties proven about it mathematically. I’ll continue learning about Coq and exploring what I can do with it. There’s couple good resources I have been using. One is Software Foundations and another (that I haven’t had time to read, but that has been recommended me) is Coq in a Hurry.