I haven’t written anything in a while. But after seeing the metaprogramming video on Computerphile the other day I felt the urge to write about Lisp again, so I decided to do a small series of posts on fun/useful little Common Lisp macros I’ve made over the past couple of years.
The first macro we’ll look at is
(defmacro gathering (&body body) "Run `body` to gather some things and return a fresh list of them. `body` will be executed with the symbol `gather` bound to a function of one argument. Once `body` has finished, a list of everything `gather` was called on will be returned. It's handy for pulling results out of code that executes procedurally and doesn't return anything, like `maphash` or Alexandria's `map-permutations`. The `gather` function can be passed to other functions, but should not be retained once the `gathering` form has returned (it would be useless to do so anyway). Examples: (gathering (dotimes (i 5) (gather i)) => (0 1 2 3 4) (gathering (mapc #'gather '(1 2 3)) (mapc #'gather '(a b))) => (1 2 3 a b) " (alexandria:with-gensyms (result) `(let ((,result nil)) (flet ((gather (item) (push item ,result) item)) ,@body) (nreverse ,result))))
As the docstring mentions, sometimes you’ll encounter procedural code that iterates over things but doesn’t return any results (CL’s
maphash and Alexandria’s
map-permutations are two examples). The
gathering macro provides an easy way to plug into the guts of the iteration and get a list back. The docstring describes how to use the macro, but there’s a couple of extra things to note before we move on.
Default LET Bindings
I’m aware that the
(let ((,result nil)) ...) in the macro could be simplified to
(let (,result) ...) because bindings without an initial value default to
nil . I just personally dislike that style and prefer to be explicit about the initial values, even when they’re
flet ed function that actually does the gathering is named as the symbol
gather , not a gensym. Some folks will dislike that because it captures the function name. I don’t mind it. I consider that behavior part of the intended/documented API of the macro, and because the symbol
gather is coming from the macro’s package you’ll need to import it (or package qualify it) to access it.
If you want to provide a bit more safety here you could potentially define a vanilla function for
gather like this:
(defun gather (value) "Gather `value` into the result. Must be called from inside `gathering`." (error "GATHER be called from within a GATHERING macro."))
Doing this would mean calling
gather outside a
gathering block would signal an error, and someone defining their own
gather function (which could accidentally get shadowed by the macro’s
flet later) would get a warning about redefining the function, which would hopefully make them realize the potential conflict earlier.
Of course you could also write your own version of
gathering that takes in a symbol to use for the function, if you prefer. That would avoid all of these issues, at the cost of making every
(gathering ...) slightly more verbose.
My actual implementation of this macro uses a simple queue data structure instead of a list to avoid having to reverse the result at the end, but I didn’t want to include an extra dependency just for this blog post.
Feel free to skip this section. It gets a little bit into the weeds.
One potential optimization we could make is to declare the
gather function to have dynamic extent . This means that some implementations (e.g. SBCL) can stack allocate the closure and save a heap allocation.
(defmacro gathering-dynamic-extent (&body body) (alexandria:with-gensyms (result) `(let ((,result nil)) (flet ((gather (item) (push item ,result) item)) (declare (dynamic-extent #'gather)) ; NEW ,@body) (nreverse ,result))))
Whether this matters much depends on the usage patterns. If you use
gathering with some code that typically gathers no elements (and only occasionally gathers a few), you won’t have to heap allocate anything for the common cases, and things will be a little more efficient. If you typically gather a bunch of things then you’re heap allocating all the cons cells anyway, so allocating the closure wouldn’t be a big deal.
If you want to actually see the stack allocation in action, it might be a little trickier than you think. SBCL is pretty smart about inlining things, so it might not allocate the closure at runtime at all , even without the
(disassemble (lambda () (gathering (mapc #'gather '(1 2 3))))) ; disassembly for (LAMBDA ()) ; Size: 149 bytes. Origin: #x10040B8340 ; 40: 498B4C2460 MOV RCX, [R12+96] ; no-arg-parsing entry point ; thread.binding-stack-pointer ; 45: 48894DF8 MOV [RBP-8], RCX ; 49: BB17001020 MOV EBX, #x20100017 ; NIL ; 4E: 488B0D9BFFFFFF MOV RCX, [RIP-101] ; '(1 2 3) ; 55: EB4D JMP L3 ; 57: 660F1F840000000000 NOP ; 60: L0: 4881FA17001020 CMP RDX, #x20100017 ; NIL ; 67: 744A JEQ L4 ; 69: 488B71F9 MOV RSI, [RCX-7] ; 6D: 49896C2440 MOV [R12+64], RBP ; thread.pseudo-atomic-bits ; 72: 4D8B5C2418 MOV R11, [R12+24] ; thread.alloc-region ; 77: 498D5310 LEA RDX, [R11+16] ; 7B: 493B542420 CMP RDX, [R12+32] ; 80: 7746 JNBE L5 ; 82: 4989542418 MOV [R12+24], RDX ; thread.alloc-region ; 87: L1: 498D5307 LEA RDX, [R11+7] ; 8B: 49316C2440 XOR [R12+64], RBP ; thread.pseudo-atomic-bits ; 90: 7403 JEQ L2 ; 92: 0F0B09 BREAK 9 ; pending interrupt trap ; 95: L2: 488972F9 MOV [RDX-7], RSI ; 99: 48895A01 MOV [RDX+1], RBX ; 9D: 488BDA MOV RBX, RDX ; A0: 488B4901 MOV RCX, [RCX+1] ; A4: L3: 488BD1 MOV RDX, RCX ; A7: 8D41F9 LEA EAX, [RCX-7] ; AA: A80F TEST AL, 15 ; AC: 74B2 JEQ L0 ; AE: 0F0B0A BREAK 10 ; error trap ; B1: 2F BYTE #X2F ; OBJECT-NOT-LIST-ERROR ; B2: 10 BYTE #X10 ; RDX ; B3: L4: 488BD3 MOV RDX, RBX ; B6: B902000000 MOV ECX, 2 ; BB: FF7508 PUSH QWORD PTR [RBP+8] ; BE: B8B8733120 MOV EAX, #x203173B8 ; # ; C3: FFE0 JMP RAX ; C5: 0F0B10 BREAK 16 ; Invalid argument count trap ; C8: L5: 6A10 PUSH 16 ; CA: 41BB3000B021 MOV R11D, #x21B00030 ; ALLOC-TO-R11 ; D0: 41FFD3 CALL R11 ; D3: EBB2 JMP L1 NIL
Here SBCL knows what
mapc is and expands it inline (notice how there’s no function call to
mapc ). It then realizes it never needs to allocate a closure for
gather at all, it can just inline that too! All the allocation stuff in that disassembly is for the
push ing of new cons cells (to convince yourself of this, run
(disassemble (lambda (&aux result) (push 1 result) result)) and compare the assembly).
But if we make our own version of
(defun my-mapc (function list) (mapc function list))
Now we can see a difference, because SBCL won’t necessarily know it can stack allocate the closure unless we tell it. Here’s what the disassembly looks like without the
(disassemble (lambda () (gathering (my-mapc #'gather '(1 2 3))))) ; disassembly for (LAMBDA ()) ; Size: 212 bytes. Origin: #x1004331A20 ; 20: 498B4C2460 MOV RCX, [R12+96] ; no-arg-parsing entry point ; thread.binding-stack-pointer ; 25: 48894DF8 MOV [RBP-8], RCX ; 29: B817001020 MOV EAX, #x20100017 ; NIL ; 2E: 49896C2440 MOV [R12+64], RBP ; thread.pseudo-atomic-bits ; 33: 4D8B5C2418 MOV R11, [R12+24] ; thread.alloc-region ; 38: 498D5B10 LEA RBX, [R11+16] ; 3C: 493B5C2420 CMP RBX, [R12+32] ; 41: 0F874D010000 JNBE #x1004331B94 ; 47: 49895C2418 MOV [R12+24], RBX ; thread.alloc-region ; 4C: 498D5B0F LEA RBX, [R11+15] ; 50: 66C743F14101 MOV WORD PTR [RBX-15], 321 ; 56: 488943F9 MOV [RBX-7], RAX ; 5A: 49316C2440 XOR [R12+64], RBP ; thread.pseudo-atomic-bits ; 5F: 7403 JEQ L0 ; 61: 0F0B09 BREAK 9 ; pending interrupt trap ; 64: L0: 48895DE8 MOV [RBP-24], RBX ; 68: 49896C2440 MOV [R12+64], RBP ; thread.pseudo-atomic-bits ; 6D: 4D8B5C2418 MOV R11, [R12+24] ; thread.alloc-region ; 72: 498D4B20 LEA RCX, [R11+32] ; 76: 493B4C2420 CMP RCX, [R12+32] ; 7B: 0F8723010000 JNBE #x1004331BA4 ; 81: 49894C2418 MOV [R12+24], RCX ; thread.alloc-region ; 86: 498D4B0B LEA RCX, [R11+11] ; 8A: B835020000 MOV EAX, 565 ; 8F: 480B042508011020 OR RAX, [#x20100108] ; SB-VM:FUNCTION-LAYOUT ; 97: 488941F5 MOV [RCX-11], RAX ; 9B: 488D058E000000 LEA RAX, [RIP+142] ; = #x1004331B30 ; A2: 488941FD MOV [RCX-3], RAX ; A6: 49316C2440 XOR [R12+64], RBP ; thread.pseudo-atomic-bits ; AB: 7403 JEQ L1 ; AD: 0F0B09 BREAK 9 ; pending interrupt trap ; B0: L1: 48895905 MOV [RCX+5], RBX ; B4: 488BD1 MOV RDX, RCX ; B7: 488D4424F0 LEA RAX, [RSP-16] ; BC: 4883EC10 SUB RSP, 16 ; C0: 488B3DE9FEFFFF MOV RDI, [RIP-279] ; '(1 2 3) ; C7: B904000000 MOV ECX, 4 ; CC: 488928 MOV [RAX], RBP ; CF: 488BE8 MOV RBP, RAX ; D2: B838734F20 MOV EAX, #x204F7338 ; # ; D7: FFD0 CALL RAX ; D9: 480F42E3 CMOVB RSP, RBX ; DD: 488B5DE8 MOV RBX, [RBP-24] ; E1: 488B53F9 MOV RDX, [RBX-7] ; E5: B902000000 MOV ECX, 2 ; EA: FF7508 PUSH QWORD PTR [RBP+8] ; ED: B8B8733120 MOV EAX, #x203173B8 ; # ; F2: FFE0 JMP RAX NIL
Now we can see it’s allocating the closure on the heap (note the
If we add the
dynamic-extent back into
(disassemble (lambda () (gathering-dynamic-extent (my-mapc #'gather '(1 2 3))))) ; disassembly for (LAMBDA ()) ; Size: 136 bytes. Origin: #x10043F46E0 ; 6E0: 498B4C2460 MOV RCX, [R12+96] ; no-arg-parsing entry point ; thread.binding-stack-pointer ; 6E5: 48894DF8 MOV [RBP-8], RCX ; 6E9: 48C745E817001020 MOV QWORD PTR [RBP-24], #x20100017 ; NIL ; 6F1: 488BDC MOV RBX, RSP ; 6F4: 48895DE0 MOV [RBP-32], RBX ; 6F8: 4883EC20 SUB RSP, 32 ; 6FC: 4883E4F0 AND RSP, -16 ; 700: 488D4C240B LEA RCX, [RSP+11] ; 705: B835020000 MOV EAX, 565 ; 70A: 480B042508011020 OR RAX, [#x20100108] ; SB-VM:FUNCTION-LAYOUT ; 712: 488941F5 MOV [RCX-11], RAX ; 716: 488D0583000000 LEA RAX, [RIP+131] ; = #x10043F47A0 ; 71D: 488941FD MOV [RCX-3], RAX ; 721: 48896905 MOV [RCX+5], RBP ; 725: 488BD1 MOV RDX, RCX ; 728: 488D4424F0 LEA RAX, [RSP-16] ; 72D: 4883EC10 SUB RSP, 16 ; 731: 488B3D38FFFFFF MOV RDI, [RIP-200] ; '(1 2 3) ; 738: B904000000 MOV ECX, 4 ; 73D: 488928 MOV [RAX], RBP ; 740: 488BE8 MOV RBP, RAX ; 743: B838734F20 MOV EAX, #x204F7338 ; # ; 748: FFD0 CALL RAX ; 74A: 480F42E3 CMOVB RSP, RBX ; 74E: 488B5DE0 MOV RBX, [RBP-32] ; 752: 488BE3 MOV RSP, RBX ; 755: 488B55E8 MOV RDX, [RBP-24] ; 759: B902000000 MOV ECX, 2 ; 75E: FF7508 PUSH QWORD PTR [RBP+8] ; 761: B8B8733120 MOV EAX, #x203173B8 ; # ; 766: FFE0 JMP RAX NIL
Much nicer. However, this optimization comes with a price: safety.
gather closure should never be called outside of the
gathering block that defines it. As the docstring says: it would be useless to do so anyway, because the result has already been returned. But what would happen if the user accidentally calls the closure? Let’s try it out, first on the original version without the
(defparameter *f* nil) (defun leak (function) (setf *f* function)) (gathering (leak #'gather)) (funcall *f* 1) ; => 1
Here the closure is heap-allocated, so calling it later is fine (if useless). But what happens if we try our optimized version?
(gathering-dynamic-extent (leak #'gather)) (funcall *f*) ; CORRUPTION WARNING in SBCL pid 19575(tid 0x7fffb6814380): ; Memory fault at 0xffffffff849ff8e7 (pc=0x19ff8df, sp=0x19ff890) ; The integrity of this image is possibly compromised. ; Continuing with fingers crossed. ; ; debugger invoked on a SB-SYS:MEMORY-FAULT-ERROR in thread ; #: ; Unhandled memory fault at #xFFFFFFFF849FF8E7. ; ; restarts (invokable by number or by possibly-abbreviated name): ; 0: [ABORT] Exit debugger, returning to top level. ; ; (SB-SYS:MEMORY-FAULT-ERROR) ; 0]
Things get even worse if you’re brave (foolish) enough to be running with
(declaim (safety 0)) :
(gathering-dynamic-extent (leak #'gather)) (funcall *f*) ; debugger invoked on a SB-KERNEL:FLOATING-POINT-EXCEPTION in thread ; #: ; An arithmetic error SB-KERNEL:FLOATING-POINT-EXCEPTION was signalled. ; No traps are enabled? How can this be? ; ; ; restarts (invokable by number or by possibly-abbreviated name): ; 0: [ABORT] Exit debugger, returning to top level. ; ; ("bogus stack frame") ; 0]
The moral of this story is that although we can optimize for a little bit of speed, it comes at a price that might not be worth paying.
Here’s an exercise for you: make the original heap-allocated version signal an error (with a nice error message) when called outside of its
gathering block, instead of silently doing something useless.
Let’s finish up by looking at some places where I’ve found this macro to be handy. I won’t go into too much depth about the individual pieces of code, but feel free to ask if you have questions.
First up, an example from my Project Euler
(defun pandigitals (&optional (start 1) (end 9)) "Return a list of all `start` to `end` (inclusive) pandigital numbers." (gathering (map-permutations (lambda (digits) ;; 0-to-n pandigitals are annoying because we don't want ;; to include those with a 0 first. (unless (zerop (first digits)) (gather (digits-to-number digits)))) (irange start end) :copy nil)))
This is a prime example of where Alexandria’s
map-permutations not returning anything is annoying, but also shows how
gathering provides you a little more flexibility. Even if
map-permutations returned a list of results, we’d still need to filter out the ones that start with zero. With
gathering we can avoid collecting the unneeded results at all by simply not calling
gather on them.
Hash Table Contents
Next is an easy way to convert a hash table to lists of
(key value) :
(defun hash-table-contents (hash-table) "Return a fresh list of `(key value)` elements of `hash-table`." (gathering (maphash (compose #'gather #'list) hash-table)))
gather is a
flet ed function we can use it like any other function. Here we
compose it with
list and pass it off to
maphash (which, for some reason, doesn’t return its results).
We’ll end with a triangle-generation function from my procedural art bot :
(defun generate-universe-balancing (depth) (gathering (labels ((should-stop-p (iteration) (or (= depth iteration) (and (> iteration 6) (randomp (map-range 0 depth 0.0 0.05 iteration) #'rand)))) (recur (triangle &optional (iteration 0)) (if (should-stop-p iteration) (gather triangle) (map nil (rcurry #'recur (1+ iteration)) (split-triangle-self-balancing triangle))))) (map nil #'recur (initial-triangles)))))
This is used to generate the triangles for images like this:
gathering lets me write the recursive generation algorithm in a natural way, and just plug in a simple
(gather triangle) when we finally bottom out at the base case.