~bjoli/for-loops-docs

This is a pure syntax-case version of racket's for loops. It is mostly compatible, although the code generated is quite different. Some things are not supported, such as the body-or-break clause and sequence post-guards, but for most things they are equal. The code generated is not as complex, and some of the elegance when defining new for loop variants aren't there. For the vast majority of uses you won't notice any difference though.

#Rationale

I have always been a huge fan of srfi-42 (eager comprehensions), but I have always been sad that it 1. outputs code that is slower than a hand-rolled loop 2. outputs ugly code that depends on mutation. Trying Racket was a great feeling. The looping constructs were a lot more powerful than srfi-42. Sadly, we didn't have them in guile.

These for loops have zero, or very very close to zero runtime costs under guile 2.2. They have been written with a close eye to how guile 2.2's optimizer treats it's output, and the output of for/list is not far from how one would write a named let:

;; stupid named let loop, as a scheme programmer would write them 
(let loop ([acc '()] [i 100])
  (if (zero? i)
    (reverse! acc)
    (loop (cons i acc) (- i 1))))

;; is equivilent to:
(for/list ([i (in-range 100 0 -1)]) i)

;; Which expands to a hell of a lot of code (something like 30 lines)
;; which optimizes to this:
(reverse!
 (let loop ((acc '()) (container4326 100))
   (if (< container4326 1)
     acc
     (let ((acc (cons container4326 acc)))
       (loop acc (+ container4326 -1))))))

The code is a bit different, but wrap them both in a function and compare them with ,time (length (function n)) and you won't notice any difference, even for N's as large as 100 000 000.

Compared to the output of srfi-42 you notice one thing: no safety checks. It is up to the programmer to check the arguments.

#Installation

Put the loops folder in your site-dir, or add the path to where the loops folder is to your .guile file:

(add-to-load-path "path-to-where-loops-is")

after than you just do (import (loops for-loops)) and you are ready to go!

#Conventions

All loops are exported in two versions, one regular and one starified. The starified version inserts a #:when #t where no #:when or #:unless clauses are present. This means that it creates a sub-loop for each subsequent for-clause:

(for ([a '(1 2)] [b '(1 2)]) (display (cons a b)))
;; -> (1 . 1) (2 . 2)
(for* ([a '(1 2)] [b '(1 2)]) (display (cons a b)))
;; -> (1 . 1) (1 . 2) (2 . 1) (2 . 2)

There are a bunch of supported sequences. For maximum performance, they [need]{.underline} to be defined. There is one exception to this: literals are detected at compile time and wrapped in the correct sequence "creator". If the correct sequence type cannot be inferred at compile time (for example in case of an identifier), it will be detected at runtime at an additional runtime cost:

(define l (list 1 2 3 4))
;; bad, will incur a runtime cost:
(for ([a l]) (display a))

;; good, will be properly handled, and will make guile's optimizer very happy
;; producing a lot faster code
(for ([a (in-list l)]) (display a))

;; also good, supported literals (lists, vectors, strings, numbers) are auto-detected at compile time
(for ([a '(1 2 3 4)]) (display a)) 
#/incr and /decr variants

Ranges and integer-indexed sequences (vectors, bytevectors and strings) take at most three values: start stop step. If step is not known to the optimizer at compile time, it will have to check the step to see whether it is negative or positive to determine whether to use <= or >= to know when the iteration should stop. If you know in which direction the iteration is going, you can help the optimizer by specifying it, as in the following example where the step of (in-range) is unknown at compile time. The optimizer can then inline the call instead of depending on lambdas, which makes it a lot faster.

(define (erathostenes n)
  (let ([vec (make-vector n #t)])
    (for/list ([i (in-range 2 n)] #:when (vector-ref vec i))
      (for ([j (in-range/incr (* 2 i) n i)])
        (vector-set! vec j #f))
      i)))

For in-range there are two additional sequence specifiers defined: in-range/incr and in-range/decr. The same is true for all ranges and vector-like sequences.

#Caveats

Great care has been taken to make sure the code produced by these macros can be easily optimized by the guile optimizer and partial evaluator. Apart from specifying sequences mentioned above, there are some small caveats:

  • Benchmarking should be done with at least one non-literal, since guile's partial evaluator will sometimes optimize away the loop entirely and just keep the result
  • for/list is not multi-shot continuation safe. You must not save a continuation in the middle of a for loop and return to the same continuation more than once. Doing that will produce weird results. Use the safe, but slightly slower, for/list/pure for that.

Beware: The body-or-break clause of racket is not supported. When just copying racket code, this can be hard to spot (in fact, I got unexpected behaviour twice while trying out some of my old racket loops while writing this documentation). The reason for this is that guile's syntax case does not support matching keywords, which makes me have to jump through all sorts of hoops to support the break statements in the for-clauses

;; this perfectly valid racket code compiles and runs, but doesn't do what you think:
(for/and ([i '(1 2 3 4)])
  #:break (= i 3)
  i)

;; => 4 

;; instead, we move the break clause into the for-clauses, and voila!:
(for/and ([i '(1 2 3 4)] #:break (= i 3))
  i)
;; => 2

#Keyword arguments

All loops support four different keyword arguments. #:when, #:unless #:final and #:break. A keyword, or a group of keywords, all break out a subloop, like in racket.

##:break

Whenever a break expression returns true, it stops all iteration and returns any accumulator data if an accumulator is specified.

If you have more than one break expression in one loop, they are tested wrapped in (or ...), so only one of them has to be satisfied for the loop to exit

(for/list ([a '(1 2 3 4 5)] #:break (> a 3)) 
  a)
;; => (1 2 3)
##:final

Final statements behave like break-statements, but they are tested after the execution of the body. Compare this to the example above.

(for/list ([a '(1 2 3 4)] #:final (> a 3)) 
  a)
;; => (1 2 3 4)

If you have more than one #:final expression in one loop, they are tested wrapped in (or ...), so only one of them has to be satisfied for the loop to exit

##:when and unless

Any for-clauses after a when or unless-statement are looped through in a sub-loop. Whenever a when-statement returns truthy (or unless-statement false), the subloop is executed, otherwise the outer loop continues with it's next iteration:

(for/list ([a '(1 2 3 4)] #:when (odd? a) ) 
  a)
;; => (1 3)

(for/list ([a '(1 2 3 4)] #:when (odd? a) [b '(1 2)])
  (cons a b))
;; => ((1 . 1) (1 . 2) (3 . 1) (3 . 2))

If you specify several when- or unless-expressions in a row, they are tested wrapped in (and ...), i.e: they all have to be satisfied for the body to be executed.

#Loops

#for, for*

(for (for-clauses ...) body body* ...)

A looping construct only for side-effects. It returns nothing (really nothing) and using it where a return value is expected will make guile crash.

#for/fold for*/fold

(for/fold ([accumulator init-expr] ...) (for-clauses) body body* ...)

The fundamental list iterator. Each evaluation of the body has to return as many values as there are accumulators. Here is an example shamelessly taken from the racket documentation:

(for/fold ([sum 0]
           [rev-roots '()])
          ([i '(1 2 3 4)])
  (values (+ sum i) (cons (sqrt i) rev-roots)))
;; $1 = 10
;; $2 = (2 1.7320508075688772 1.4142135623730951 1)
#for/and for*/and

(for/and (for-clauses) body body* ...)

If/when the body of for/and evaluates to #f, iteration stops and #f is returned. Otherwise it returns the result of the last iteration. If no body is ever evaluated, #t is returned

(for/and ([l '(1 2 3 "error?")])
  (< l 3))
;; => #f because iteration stops before the string

(for/and ([l '(1 2 2)])
  (< l 3))
;; => #t
#for/or for*/or

(for/or (for-clause) body body* ...)

The reverse of for/and. Whenever the body returns non-false, stop iteration and return that value. If no body is ever evaluated, #f is returned.

(for/or ([i '(1 1 1 1)] [j '(2 2 2 2)])
  (= i j))
;; => #f

(for/or ([l '(#f #f #f 3)])
  l)
;; => 3
#for/sum for*/sum

(for/sum (for-clauses) body body* ...)

Sums the result of every iteration. If no body is evaluated, returns 0

(for/sum ([i '(1 2 3 4)])
  i)
;; => 9

(for/sum ([l '(9 16 25)])
  (sqrt l))
;; => 12
#for/product for*/product

(for/product (for-clause ...) body body* ...)

(for/product ([i '(2 2 2 2)])
  i)
;; => 16

Multiplies the result of each iteration with the product of every previous iteration. If no body is evaluated, returns 1

#for/list for*/list

(for/list (for-clause ...) body body* ...)

WARNING: for/list is not multi-shot continuation safe. It uses a reverse! at the end (for speed and memory usage). For multi-shot continuations, use for/list/pure. Read more about the reasoning behind this in the FAQ.

Probably the most useful for-loop. Iterates through the for-clauses and conses all results into a list. If no body is ever evaluated for/list returns '()

(for/list ([a '(1 2 3)] [b "abc"])
  (cons a b))

;; => ((1 . #\a) (2 . #\b) (3 . #\c))

(for*/list ([a '(1 2)] [b "abc"])
  (cons a b))
;; => ((1 . #\a) (1 . #\b) (1 . #\c) (2 . #\a) (2 . #\b) (2 . #\c))
#for/list/pure for*/list/pure

(for/list/pure (for-clauses ...) body body* ...)

A version of for/list that uses a regular reverse at the end. This is slower, but continuation safe (i.e: you can save the continuation and return to it several times).

#for/list/parallel

(for/list/parallel (for-clauses ...) body body* ...)

Like for/list, but executes all bodies in parallel using guile's high level threading features (futures). This is useful for heavy computations. If no body is ever evaluated, for/list/parallel returns '()

;; really slow fibonacci implementation, and also not a good example since it is very cache-heavy
;; but it illustrates the use case pretty well. There are other examples where you might get 
;; a more linear speedup.
(define (fib n)
  (if (<= n 2)
      1
      (+ (fib (- n 1)) (fib (- n 2)))))



;; I have 4 cores (with HT), so this runs 8 threads in parallel
;; Using the repl:

> ,time (for/list/parallel ([a '(40 40 40 40 40 40 40 40 40)]) (fib a))
;; $83 = (102334155 102334155 102334155 102334155 102334155 102334155 102334155 102334155 102334155)
;; 19.735858s real time, 108.121914s run time.  0.000000s spent in GC.

> ,time (for/list ([a '(40 40 40 40 40 40 40 40 40)]) (fib a))
;; $84 = (102334155 102334155 102334155 102334155 102334155 102334155 102334155 102334155 102334155)
;; 61.512955s real time, 61.510654s run time.  0.000000s spent in GC.

As you can see, the speedup isn't linear. For other tasks (like i/o) it might be. All the usual caveats of multithreaded programming apply.

#for/vector for*/vector

(for/vector [#:length integer-exp [#:fill fill-exp]] (for-clause ...) body body* ...)

This returns a vector. One might be tempted to think that it is faster than wrapping for/list in a list->vector, but that is not always the case in guile. If you don't specify a length, for/vector will dynamically grow the vector with a factor of 2 starting at size 16. This will cause a lot of copying and might lead to bad memory/GC performance. Specifying a length will give you better performance (both memory and speed) than for/list, but if what you want is a vector of an unspecified length, using for/list is actually faster.

The code produced by for/vector runs considerably faster than for/list in chez scheme, so there is probably room for improvement in what the macros can do to make guile optimize it better.

Specifying a #:length expression will give you a vector of that length. If you specify an additional #:fill, any remaining vector fields will be filled with that value. #:fill defaults to 0

(for/vector #:length 2 ([a '(1 1 1 2 4)] #:when (even? a))
  a)
;; => #(2 4)


(for/vector #:length 5 #:fill #f ([a '(1 2 3)])
  a)
;; => #(1 2 3 #f #f)
#for/first for*/first

(for/first (for-clauses) body body* ...)

Stops whenever the first body has been evaluated. If no body is evaluated, returns #f.

(for/first ([a '(1 1 1 2 4)] #:when (even? a))
  a)
;; => 2


(for/first ([a '(1 1 1 1)] #:when (even? a))
  a)
;; => #f
#for/last for*last

(for/last (for-clause ...) body body* ...)

Returns the result of the body of the last iteration. If no body is ever evaluated, return #f

(for/last ([a '(1 2 3 4)])
  a)
;; => 4


(for/last ([a '(1 2 3 4)] #:break (> a 3))
  a)
;; => 3

l

#for/stream for*/stream

(for/stream (for-clause ...) body body* ...)

Available in the (loops seq stream). Creates a srfi-41 stream.

#for/foldr for*/foldr

(for/foldr ((fold-var init-expr) ... delay-clauses ...) for-clauses body ...)

for/foldr is somewhat of a special case. Whereas all other loops create properly tail-call-optimizable loops, for/foldr does not. This makes it less usable for some things, but together with the delay-clauses it can be used to iterate lazily. Whereas for/list is expressed using a non-linear reverse! at the end, it could just as well be expressed as a foldr, removing any need to use reverse:

(for/foldr ((fold-var '())) ((b (in-range 1 5))) 
  (cons (* b b) fold-var))
;; => (1 4 9 16)

The above code is compiled to:

(let loop ((container28041 1))
  (if (>= container28041 5)
    '()
    (let ((fold-var (loop (+ container28041 1))))
      (cons (* container28041 container28041) fold-var))))

To further make for/foldr usable, it allows for delaying execution using delay-clauses. This is either using the keyword #:delay or by specifying #:delay-as and any of the following: thunk. (more are to be added). Using only #:delay the next iteration is delayed using promises. The next iteration is wrapped in a promise, so you have to (force ...) any fold vars to continue the iteration. If you are using multiple fold-vars #:delay is so far the only supported keyword. Delaying as thunk allows the inliner to inline the expression, which may lead to better performance, but also to [at best]{.underline} quadratic runtime if you use the thunk many times in the loop body.

The above list example converted to #:delay-as thunk:

(for/foldr ((fold-var '()) #:delay-as thunk) ((b (in-range 1 5))) 
  (cons (* b b) (fold-var)))
;; => (1 4 9 16)

The above code is compiled to:

(let loop ((container28041 1))
  (if (>= container28041 5)
    '()
     (cons (* container28041 container28041) (loop (+ container28041 1)))))

which in this particular case doesn't have any impact on run-time, but might lead to other cases where it is faster.

#Sequences

If you are not iterating over a literal list, vector, integer or hash you should wrap whatever you want to iterate over in a sequence-expression. A sequence expression is simply a macro that expands into five values. The first is a procedure to get the first element, the second is a procedure to move on in the sequence, the third is a predicate function that returns true when the sequence has stopped, the fourth is the start value and the fifth is something called a pre-check. The pre-check is a predicate that checks the value produced by the first procedure. If it returns true, the outer loop will be called. If the predicate returns true for an outer loop, it will end the loop.

In case of (in-list '(1 2 3)) that would mean (values car cdr null? '(1 2 3) (lambda (val) #f))

#Conventions

vector-like sequences (strings, vectors, bytevectors) have an extra procedure to iterate through them in reverse. In case of vectors, this is called in-reverse-vector.

(for ([a (in-reverse-string "a b c")])
  (display a))
;; => prints: c b a
#in-list in-circular-list

for iterating over lists. in-circular-lists is a slightly faster version for lists that are circular (since no null checks are needed).

#in-vector in-reverse-vector

(in-vector vector [start [end [step]]]) (in-vector/incr vector start end step) (in-vector/decr vector start end step) (in-reverse-vector vector)

for iterating over vectors.

#in-string in-reverse-string

(in-string string [start [end [step]]]) (in-string/incr string start end step) (in-string/decr string start end step) (in-reverse-string string)

for iterating over strings

#in-range

(in-range stop) (in-range start [end [step]]) (in-range/incr start end step) (in-range/decr start end step)

The first form is the same as (in-range 0 stop 1). Unless specified the default value for start is 0 and step is 1.

If you provide a step, but it is not visible to the optimizer at compile time, the optimizer cannot properly optimize in-range. Read more about this in the "conventions" section.

#in-naturals

(in-naturals [n])

An infinite sequence of natural numbers starting from 0 or, if specified, n. Is faster than in-range.

#in-hash

for iterating over hashes. in-hash returns two values, so you will have to specify two identifiers

(for ([(key value) (in-hash my-hash)])
  (display "key:   ") (display key) (newline)
  (display "value: ") (display value) (newline)) 
#in-hash-pairs

Iterates over the hash as well, but returns cons pairs of (key . value). This is a bit faster than in-hash.

#in-value

Iterates over a single value. Useful for for*-styled loops.

#in-port

(in-port port proc)

In port creates a sequence that runs proc on port until port-eof? returns true.

#in-lines

(in-lines textual-port)

Is the same as (in-port textual-port get-line), where get-line is get-line from (rnrs io ports (6))

#in-chars

(in-chars textual-port)

The same as (in-port textual-port get-char).

#in-generator

(in-generator generator)

Iterates over srfi-125-styled generators. These generators return #eof when no more objects are available.

#in-stream

(in-stream str)

This sequence is avaliable in (loops seq streams) and loops through a stream.

#bytevectors

The module (loops seq bytevectors) contains in-bytevector, in-reverse-bytevector (for u8 bytevectors) and in-s8-byevector and in-reverse-s8-bytevector for signed 8 bit bytevectors. They work as in-vector

#Defining your own

Defining your own sequence type is simple. I provide (define-sequence name val next stop start [pre-check]).

As said above, the in-list macro expands to five values. (in-list lst) becomes (values car cdr null? lst (lambda (val) #f). For more advanced sequences, you can of course expand to lambdas with closures. In range does this, but the guile inliner is smart enough to break out the closure and inline the lambdas.

What you need to do after defining a sequence type is to add it to the collection of recognized sequences, otherwise you will not get the desired results, since the loops will just wrap it in a in-value. This is done with the add-sequence-type and add-sequence-type* procedures.

add-sequence-type takes a symbol with the macro name, and add-sequence-type* takes many symbols.

In the following example we define an in-alist macro

(define-sequence (in-alist lst)
  (lambda (a) (values (caar a) (cdar a)))
  cdr
  null?
  lst)

(add-sequence-type 'in-alist)
(for ([(key elem) (in-alist '((hej . hopp) (haj . hoj)))]) 
  (display (list key elem)))
;; => (hej hopp) (haj hoj)

#Porting

This library should be fairly easy to port to any scheme that has syntax-case. There is little in this library that is guile-specific, and I suspect the hardest part will be the slight differences in how syntax-case behaves over different implementations. I thought I should mention the obvious parts:

#keywords

The macros currently use keywords for when,break,final and unless-statements in the for-clauses and in for/vector. If your scheme does not support keywords, you can either define it as identifier-syntax or just regular auxilary syntax.

#parse-ids

parse-ids uses a guile-specific optional argument syntax. Any optional arguments in guile default to #f, thus this should be trivial to implement as a case-lambda.

#for/vector

for/vector depends on vector-copy and vector-copy! from srfi-43. They are trivial to re-implement.

#for/list/parallel

uses guile's (ice-9 futures) which provide a thread-queue based future implementation. This will be hard to port, and has been put in it's own file and can be easily omitted.

#in-lines

In lines is implemented using get-line from (rnrs io ports). If your scheme provides a different procedure to read a line, it is a trivial change.

#FAQ

#Why MPL?

I use MPLv2 for all my projects. I like copyleft licences, and I agree very much with the FSF on many things. There are some questions about the LGPL and linking, so instead of having people ask questions, I much more like the idea of people statically linking to their hearts desire and being forced to open source anything only if they actually make any changes to the loop code itself.

I want people to be able to drop a copy of my code into their repository and use it without having to think much first. And I very much like the loose demands on acknowledgements of the MPLv2. Not having to "reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution" really lowers the barrier to use for silly small scripts.

If any project I sympathise with would like to use this code, I can of course make some kind of license changes or even give copyright away.

#Why isn't for/list multi-shot continuation safe?

Because it uses a linear update reverse! at the end. The options considered were:

  1. Make the update non-linear, which I chose not to because I believe that the standard case should be the fastest and that multi-shot continuations in a loop is a special case, and should be supported as such.
  2. Do what guile's map does: Since guile has no stack overflows, making the loops non-tail-recursive was an option. The benefit of this is that we wouldn't need to reverse the list at the end, nor would we need a special case for multi-shot continuations. At least in theory. In practice, however, re-instating a large continuation of cons-es is very slow, and we would have needed a special case for continuations anyway, even one-shot continuations.
  3. Have a special case for multi-shot continuations, state it clearly in the docs, and leave the rest of the world to use the fast version by default.

I went with 3, and as long as you don't do the following, you are safe:

(import (ice-9 control)
        (loops for-loops))

(define c #f)
(% (for/list ([a (in-list '(1 2 3 4 5 6))])
                    (when (= 3 a) (abort))
                    a)
 (lambda (k)
   (set! c k)))

(c) ;; -> (1 2 3 4 5 6)
(c) ;; -> (6 5 4 3 2 3 4 5 6)

However, with for/list/pure that works as expected.

#Why are my loops slow?

Have you forgotten the sequence type? Are you using in-hash? If you reply no to both of those, look at the output of ,opt. Is that doing something funny? Under guile2.2 and 3.0, the loops should be just as fast as hand-rolled loops. If the output of ,opt is funny, please file a bug report.

#Acknowledgements

First of all, racket showed the way! I didn't look at the source code (and frankly, I don't think I would understand much), but had more than one look at the code produced by the macros. The macro expander in DrRacket is a great tool,

Secondly, I would like to thank stis and ArneBab in #guile @ freenode.net. Stis for helping me making the macros faster, and ArneBab for helpful discussion. stis has his own implementation of for loops using syntax-parse. The source looks very fancy. You should have a look: https://gitlab.com/tampe/guile-for. I would have loved to use syntax parse for these macros. Just wow.

About this wiki

commit 9ece3b712f851c869faa457ec19d61bbe5361185
Author: Linus Björnstam <linus@Linuss-iMac.local>
Date:   2020-01-23T11:58:53+01:00

Unbork org export \'
Clone this wiki
https://git.sr.ht/~bjoli/for-loops-docs (read-only)
git@git.sr.ht:~bjoli/for-loops-docs (read/write)