Computing primes

Seems that recently I have been tinkering a lot with various kinds of numbers and sequences of numbers. This entry doesn’t differ from previous ones in that respect, as I’m going to show two straightforward ways of computing prime numbers.

“A prime number (or a prime) is a natural number greater than 1 that has no positive divisors other than 1 and itself.” (thanks Wikipedia). One well known way of computing primes is sieve of Eratosthenes (I quite like the name, it’s evokes image of a mathematician sieving through big pile of numbers and trying to find some special numbers hidden among them). The two systems (they’re almost identical though) I’m showing are more or less identical to the sieve of Eratosthenes.

First one is build using potentially infinite sequences I have been writing about recently. Thus, it will build up a list of prime numbers that are computed only once and then retrieved from the internal cache. Skeleton for this algorithm is as follows:

(defseq primes [n]
  (cond [(= n 0) 2]
        [true (do (setv guess (next-possible-prime n))
                  (while (divisible? guess (previous-primes n))
                    (setv guess (inc guess)))
                  guess)]))

There are two branches. First branch is a special case that defines that first prime is 2. The second branch is used to compute subsequent primes in terms of previous ones. When we want to know prime in index 1 (namely, the second prime), we first set our guess to be one bigger than the previous prime we know. After that we check if the guess can be divided by any of the previous prime numbers we have collected. If so, guess is incremented by one and we repeat the check for divisibility. As soon as we find a number that is not divisible by any primes we have found before, we have a new prime in our hands and return it.

Helper functions that this algorithm uses are rather short and could have been in the body of the algorithm (actually, they originally were there). But I thought that by separating them from the main body and giving them proper names, the intent of the code would be clearer and easier to understand.

(defn next-possible-prime [n]
  (inc (get primes (dec n))))

(defn divisible? [n prevs]
  (any (map (fn [x] (not (% n x))) prevs)))
  
(defn previous-primes [n]
  (take (dec n) primes))

I could also write the algorithm in recursive manner, but that would lead into exceeding maximum recursion limit. Hy offers tools for tackling just this kind of situation in contrib module: loop and recur:

(defseq primes [n]
  (cond [(= n 0) 2]
        [true (loop [[guess (next-possible-prime n)]]
                (if (divisible? guess (previous-primes n))
                    (recur (inc guess))
                    guess))]))

Of course if next-possible-prime, divisible? and previous-primes aren’t useful for rest of the program, we can just pack them in as nested function definitions:

(defseq primes [n]
  "infinite sequence of primes"

  (defn next-possible-prime [n]
    (inc (get primes (dec n))))
  (defn divisible? [n prevs]
    (any (map (fn [x] (not (% n x))) prevs)))
  (defn previous-primes [n]
    (take (dec n) primes))

  (cond [(= n 0) 2]
        [true (loop [[guess (next-possible-prime n)]]
                (if (divisible? guess (previous-primes n))
                    (recur (inc guess))
                    guess))]))

Using this sequence is simple. One can ask the string representation of it to quickly peek at first 10 primes, or do more involved tricks with drop, take and such:

=> primes
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, ...]

=> (list (take 10 (drop 50 primes)))
[233, 239, 241, 251, 257, 263, 269, 271, 277, 281]

But what about if we’re just interested in producing a stream of primes and aren’t interested in storing them? We can do this by building a generator that produces integers starting from 2 (since it’s the first prime) and then filtering the resulting stream. So when we ask first number, we get 2. Then we’ll add a filter to generator that filters out all numbers divisible by 2:

=> (defn numbers-seq []
...  "infinite sequence of integers starting from 2"
...  (setv i 2)
...  (while true
...    (yield i)
...    (setv i (inc i))))

=> (setv primes (numbers-generator))
=> (next primes)
2
=> (setv primes (filter (fn [x] (% x 2)) primes))

And then we just repeat the process. numbers-generator will product 3, which isn’t divisible by 2, so it goes through the pipeline:

=> (next primes)
3
=> (setv primes (filter (fn [x] (% x 3)) primes))

No we have gotten two primes out of the pipeline: 2 and 3. Next numbers-generator will produce 4, but since that’s divisible by 2, the first filter will reject it. Since the pipeline was asked to produce a number, numbers-generator is automatically instructed to generate a number 5 that goes through both filters:

=> (next primes)
5
=> (setv primes (filter (fn [x] (% x 5)) primes))

We can keep repeating this process until our fingers hurt and we think that there must be a more intelligent way of doing this. Instead of doing the repetitive task of building the pipeline of filters, we can let the computer to take care of it. First we need a function that when given a generator, will take the first element from it, construct a new generator that will filter numbers divisible by that number and return a tuple containing the number and new generator:

(defn next-prime [gen]
  "get next prime and new filter for primes after that one"
  (setv res (next gen))
  (, res (filter (fn [x] (% x res)) gen)))

Final step is to assemble these two parts together with a main loop that will yield prime numbers forever:

(defn prime-generator []
  "generator for prime numbers"
  (setv numbers (numbers-seq))
  (while true
    (setv (, res numbers) (next-prime numbers))
    (yield res)))

With this system, we can access every prime only once, after which it will be forever gone. And like with the previous system, you really don’t want to try and get each and every prime in it:

=> (list (take 10 (prime-generator)))
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

This system doesn’t store previously computed primes in a way that they would be accessible at later time (not at least without some really involved hacking of filters in the pipeline). But they’re still present in the filters, so over the time this system will slowly grow to consume more and more memory. But I quite like how the pipeline is slowly extended to filter more and more numbers away from the stream.

Silly tinkering like this is nice way of testing new concepts and trying out new things with old tools. Sometimes it even highlights corner cases in tools that need some extra attention. For example, while writing this, I realized that defseq only allowed single form as a body for computing the sequence. Normally this would be enough, but as soon as you want to add docstrings or internal functions, it breaks down. It just isn’t possible to do that cleanly (sure, you can wrap everything in do, but that’s ugly. Better let the macro to do that implicitly and concentrate on more important matters.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s