Jupyter + Archimedes

I have been playing with Jupyter and Calysto Hy recently. When it comes to testing (especially with the tools I’m used to), there are a couple of hitches: tests written for nosetests aren’t executed automatically, and any exception that a cell throws stops execution of the whole notebook. Like before, this blog post was originally written with Jupyter, because it makes it so easy to interleave code and prose while making sure that the code is actually runnable.

But let’s get started and define a rudimentary version of our check macro that automatically executes the fact it defines.

(require archimedes)

(defmacro check [fact-name &rest code]
          "define and execute a fact"
          `(do ((fact ~fact-name ~@code))
               (+ "ok: " ~fact-name)))

That takes care of the first problem. Now any check we define is automatically executed (and thus verified). Let’s check how this works.

(import [hamcrest [assert-that is- equal-to]])

(check "1 is 1"
       (assert-that 1 (is- (equal-to 1))))
'ok: 1 is 1'

So far so good, but now for the hard part: dealing with and reporting exceptions. As previously mentioned, any exception raised from a cell will stop execution of the rest of the notebook. Thus, we need to catch exceptions. Here’s a revised version of our check macro.

(defmacro check [fact-name &rest code]
          "define and execute a fact"
          `(try ((fact ~fact-name ~@code))
                (catch [e Exception] (print (+ "Failure: " ~fact-name)))
                (else (print (+ "Ok: "  ~fact-name)))))

(check "1 is 2"
       (assert-that 1 (is- (equal-to 2))))
Failure: 1 is 2

Instead of returning a string indicating the result (and getting nice In/Out descriptions in the notebook), I chose to explicitly print what we want our test case to report. This of course messes up Jupyter’s In/Out cell rendering a bit, but for our purposes the drawback is necessary, especially when we later start to deal with error messages that contain multiple lines.
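For readers more at home in Python than Hy, the control flow of the macro above can be sketched roughly like this. The `check` function and its `body` callable are illustrative stand-ins of my own, not Archimedes code, and the function returns the printed line only so that the result can be inspected.

```python
def check(fact_name, body):
    """Run body and report the outcome instead of letting exceptions escape."""
    try:
        body()
    except Exception:
        # Any exception counts as a failure; execution carries on afterwards.
        result = "Failure: " + fact_name
    else:
        # Only reached when body() raised nothing.
        result = "Ok: " + fact_name
    print(result)
    return result  # returned purely for inspection (an assumption of this sketch)


def one_is_two():
    assert 1 == 2


check("1 is 2", one_is_two)  # prints "Failure: 1 is 2"
```

The `else` branch matters here: putting the "Ok" report inside the `try` body would risk catching an exception raised by the reporting itself.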

While the check macro doesn’t stop execution of the whole notebook anymore, the error being reported isn’t particularly neat. It only tells the name of the test that failed, but doesn’t indicate why it failed. Luckily, AssertionError has a description that we can use for just these occasions.

(defmacro check [fact-name &rest code]
          "define and execute a fact"
          `(try ((fact ~fact-name ~@code))
                (catch [e Exception] (print (+ "Failure: " ~fact-name (str e))))
                (else (print (+ "Ok: " ~fact-name)))))

(check "1 is 2"
       (assert-that 1 (is- (equal-to 2))))
Failure: 1 is 2
Expected: <2>
     but: was <1>

Of course we’re banking on whoever raises the exception actually using AssertionError and putting some sensible description there. However, this might not always be the case, especially when execution doesn’t even get to the point of verification, but throws some unrelated exception earlier. Let’s simulate the situation and see what that looks like.

(check "funny errors are still reported"
       (assert False))
Failure: funny errors are still reported

Not that informative, but at least the system didn’t crash. And to be honest, there isn’t much information our macro could display in that case. Maybe the type of the exception being raised, if nothing else? Let’s try supplying an error message.

(check "assertion error with sensible explanation looks nicer"
       (assert False "this was intentional failure"))
Failure: assertion error with sensible explanation looks nicerthis was intentional failure

Ugh, pretty ugly. There should be a linefeed at the beginning, so the error would be more readable. Let’s try yet another type of exception.

(check "index error is reported"
       (raise (IndexError "a was out of bounds")))
Failure: index error is reporteda was out of bounds

We can get around this by adding an extra linefeed when needed. This way all checks will report in the same way, regardless of the type of the exception being raised and whether there is no explanation, an explanation without a leading linefeed or an explanation with a leading linefeed.

(defmacro check [fact-name &rest code]
          "define and execute a fact"
          `(try ((fact ~fact-name ~@code))
                (catch [e Exception] 
                       (do (setv desc (str e))
                           (if (and (> (len desc) 0)
                                    (!= (first desc) "\n"))
                               (setv desc (+ "\n" desc)))
                           (print (+ "Failure: " ~fact-name desc))))
                (else (print (+ "Ok: " ~fact-name)))))

(check "index error is reported"
       (raise (IndexError "a was out of bounds")))
Failure: index error is reported
a was out of bounds

(import [hamcrest [greater-than less-than all-of]])

(check "number ordering"
       (assert-that 2 (all-of (greater-than 1)
                              (less-than 2))))
Failure: number ordering
Expected: (a value greater than <1> and a value less than <2>)
     but: a value less than <2> was <2>

(check "plain exception without description is ok too"
       (raise (IndexError)))
Failure: plain exception without description is ok too

(check "positive cases are still working"
       (assert-that 1 (is- (equal-to 1))))
Ok: positive cases are still working
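The linefeed handling in the final macro can be sketched in plain Python as below. `format_failure` is a hypothetical helper name used only for this illustration; it mirrors the three cases discussed above (no description, a description without a leading linefeed, a description that already starts with one).

```python
def format_failure(fact_name, exc):
    """Build the failure report, putting any description on its own line."""
    desc = str(exc)
    # Add a leading linefeed only when there is a description
    # that does not already start with one.
    if desc and not desc.startswith("\n"):
        desc = "\n" + desc
    return "Failure: " + fact_name + desc


print(format_failure("index error is reported",
                     IndexError("a was out of bounds")))
print(format_failure("plain exception without description is ok too",
                     IndexError()))
```

An exception with no arguments has an empty string representation, so the second call prints only the "Failure:" line.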

Now that I have made (relatively) sure that this thing works with basic tests, I want to test it a bit with Hypothesis. Hypothesis is a property-based testing library that I sometimes use to write more generalized tests. The idea is that I specify what kind of data to feed to my function and what kind of outcome there should be. Hypothesis then generates random data and reports if it finds a case that breaks the test.

(import [hypothesis.strategies [integers]])

(check "sum of two positive integers is larger than either one of them"
       (variants :a (integers :min-value 1)
                 :b (integers :min-value 1))
       (assert-that (+ a b)
                    (all-of (greater-than a)
                            (greater-than b))))
Ok: sum of two positive integers is larger than either one of them
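Conceptually, a check like the one above boils down to a generate-and-verify loop. The following is a hand-rolled stand-in using only Python’s standard `random` module, not Hypothesis’s actual API or algorithm; `find_counterexample`, its parameters and the upper bound of 1000 are assumptions made for the sketch.

```python
import random


def find_counterexample(prop, trials=200, seed=0):
    """Try prop on random pairs of positive integers.

    Returns the first failing (a, b) pair, or None if every trial passed.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    for _ in range(trials):
        a = rng.randint(1, 1000)
        b = rng.randint(1, 1000)
        if not prop(a, b):
            return (a, b)
    return None


def sum_is_larger(a, b):
    """The property under test: a + b exceeds both of its operands."""
    return a + b > a and a + b > b


print(find_counterexample(sum_is_larger))  # prints None: no failing pair found
```

A real property-based testing library does considerably more than this, such as choosing inputs cleverly and simplifying any failing example it finds.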

The positive case seems to be working (or at least it’s not reporting anything). This of course isn’t enough, so let’s write a test for a malfunctioning implementation of adding two numbers together. In order to make things a bit harder for Hypothesis, we write our adding function so that it’s not always broken, only when the first argument is even.

(defn faulty+ [a b]
      "add two numbers"
      (if (odd? a)
          (+ a b)
          (- a b)))

(check "sum of two positive integers is larger than either one of them"
       (variants :a (integers :min-value 1)
                 :b (integers :min-value 1))
       (assert-that (faulty+ a b)
                    (all-of (greater-than a)
                            (greater-than b))))
Falsifying example: test_sum_of_two_positive_integers_is_larger_than_either_one_of_them(a=2, b=1)
Failure: sum of two positive integers is larger than either one of them
Expected: (a value greater than <2> and a value greater than <1>)
     but: a value greater than <2> was <1>

Failure works too, although it doesn’t look particularly pretty. The generated function name is used instead of the descriptive test name (as expected, since Hypothesis has no idea about Archimedes) and the falsifying example is the first thing printed. I would have liked it to appear underneath the failure line, but before the assertion report. I suspect there’s no way of fixing that, at least not without modifying the innards of Hypothesis. But this is pretty enough for us for now.
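To see why a=2, b=1 is a natural falsifying example, here is a small Python sketch that simply scans small positive integers in order. It is an exhaustive search written for illustration, not a description of how Hypothesis arrives at its example; `faulty_add` is a Python rendering of `faulty+`, and `first_failure` with its `limit` parameter is a hypothetical helper.

```python
from itertools import product


def faulty_add(a, b):
    """Python rendering of faulty+: adds for odd a, subtracts for even a."""
    return a + b if a % 2 == 1 else a - b


def first_failure(limit=10):
    """Return the first (a, b), scanning a and then b upwards from 1,
    for which the result is not larger than both operands."""
    for a, b in product(range(1, limit), repeat=2):
        total = faulty_add(a, b)
        if not (total > a and total > b):
            return (a, b)
    return None


print(first_failure())  # (2, 1)
```

Every pair with a = 1 passes because 1 is odd, so the smallest even value of a paired with the smallest b is the first one to fail.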

I expect the check macro to end up in Archimedes eventually, along with a couple of other improvements. Archimedes is not yet available on PyPI, although one can get a snapshot from the GitHub repository and try it out. One of the improvements I want to make depends on an upcoming feature of Hy. I will probably release Archimedes 0.0.1 shortly after that Hy release.
