Lazy sequences in Hy – redux

Not long ago I wrote about mathematical patterns with Hy. One central point of the post was lazy sequences that only evaluated as many elements as requested and no more. The code presented covered the basics, but several features were lacking. Namely finite sequences, calculating length and accessing negative indexes. This post builds on previous one and addresses the most glaring problems.

Original code used dictionary to hold calculated values. While this works, it makes accessing negative indexes more complicated. Likewise calculating length of the sequence is more complicated than it needs to be. In order to address that, I replaced dictionary with just a regular list. At the same time, pre-computing the first element was removed.

For finite sequences I had couple of ideas, but in the end I ended up adding –iter– magic method:

[--iter-- (fn [self]
            "create iterator for this sequence"
            (setv index 0)
            (try (while true
                   (yield (get self index))
                   (setv index (inc index)))
                 (except [_ IndexError]
                   (raise StopIteration))))]

The method creates iterator by starting from the beginning of the sequence and yielding items until IndexError is raised. When IndexError is raised, StopIteration is raised, which signals Python that the iteration has ran to it’s completion. end-sequence function can be used in sequence definition to signal end of the sequence (and access outside of the bounds of the sequence in general).

To define a simple finite sequence, one can write it as following:

(defseq shorty [n]
  (cond [[(< n 10) n]
         [true (end-sequence)]]))

This will create sequence of [0 1 2 3 4 5 6 7 8 9].

Having defined this finite sequence, I wanted to know length of it. This is implemented as –len– magic method. This is the method that is called when len function is used to find the length of any list or similar structure. To implement this, I wrote code that first makes sure that the sequence has been fully computed and then simply calls len for the internal cache (one of the perks of switching from dictionary to list):

[--len-- (fn [self]
           "length of the sequence, dangerous for infinite sequences"
           (setv index (. self high-water))
           (try (while true          
                  (get self index)
                  (setv index (inc index)))
                (except [_ IndexError]
                  (len (. self cache)))))]

The problem with this approach is that if end-sequence is never called, the computation will continue until forcibly stopped or until the computer runs out of memory. I briefly entertained an algorithm that would detect if sequence will ever complete, but abandoned it really soon. Much smarter people than me have pondered that problem and it isn’t solvable.

=> (len shorty)

Support for negative indexes was almost trivial. Since cache is list in itself, the code just needs to make sure that the sequence has been fully computed and then fetch the negative index from the cache. This is achieved with a little check in –getitem– magic method before accessing the cache:

(when (neg? n)
  (len self))

I’m not entirely happy that len is used for the side-effect. Cleaner way to implement this would be to add –force methdod that would force computing of the sequence and call that where needed.

Final flourish for the improved sequences was adding an indication in –str– if the sequence is longer than what is being displayed:

--str-- (fn [self]
          "string representation of this sequence"
          (setv items (list (take 11 self)))
          (.format (if (= (len items) 11)
                     "[{0}, ...]"
                   (.join ", " (map str items))))

This way shorty is shown as [0 1 2 3 4 5 6 7 8 9], whereas infinite version of similar sequence would be [0 1 2 3 4 5 6 7 8 9 …].

This concludes modifications for now. Now we can have finite sequences that support computing length and accessing negative indexes. I will let this brew a bit and see if the are other magic methods or special protocols that I could still implement to make sequences more interesting. I probably should also step outside of numbers and start tinkering with more interesting datatypes.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s