Chapter 11. More macros and DSLs

This chapter covers

Anaphoric macros
Shifting computation to compile time
Macro-generating macros
Designing, writing, and optimizing domain-specific languages in Clojure

This final chapter is about what many consider the most powerful feature of Clojure. John McCarthy, the inventor of the Lisp programming language, once said that Lisp is a local maximum in the space of programming languages.^[1] Clojure macros make it possible to do arbitrary code transformations of Clojure code, using Clojure itself. No programming language outside the Lisp family can do this in such a simple way. This is possible because code is data.

[1] History of Lisp (paper presented at the first History of Programming Languages conference, June 1–3, 1978), http://www-formal.stanford.edu/jmc/history/lisp/lisp.html.

You’ve seen a lot of macros in the course of this book, including in chapter 7, which served as an introduction to the topic. In this chapter, you’re going to see a lot more but with two new points of focus: the advanced uses of macros and the conscious design of a simple domain-specific language. Mastering these topics will let you design elegant abstractions for even the most demanding of problem domains.

11.1. A quick review of macros

You’ve already used macros quite a bit, but as a refresher here you’ll write a little macro to remind you what macros make possible. You’ve used Clojure’s let macro several times so far. Although let itself is a macro, it’s implemented in terms of the let* special form, which sets up lexical scope for the symbols named in the binding form. You’ll now implement a subset of the functionality of let via a macro that generates function calls. This is what you’d like to do:

(my-let [x 10
         y x
         z (+ x y)]
  (* x y z))
;=> 2000

This should return 2000, because x is 10, y is also 10, and z is 20. Here’s the implementation:

(defmacro single-arg-fn [binding-form & body]
  `((fn [~(first binding-form)] ~@body) ~(second binding-form)))

(defmacro my-let [lettings & body]
  (if (empty? lettings)
    `(do ~@body)
    `(single-arg-fn ~(take 2 lettings)
       (my-let ~(drop 2 lettings) ~@body))))

In case you’d like to refresh your memory on how macros work, please refer back to chapter 7. Although the preceding code is a limited implementation, you still get all the advantages that arise from using functions underneath the covers. For instance, you can do the following:

(my-let [[a b] [2 5]
         {:keys [x y]} {:x (* a b) :y 20}
         z (+ x y)]
        (println "a,b,x,y,z:" a b x y z)
        (* x y z))
a,b,x,y,z: 2 5 10 20 30
;=> 6000

Notice here that all destructuring forms just work, because underneath the covers, regular functions are at work. Specifically, you’re taking each my-let binding and setting it up as the single argument to an anonymous unary function. You’re essentially converting the my-let form into a nested series of such unary functions. Try expanding the macro at the read-evaluate-print loop (REPL) to see the forms generated.
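To make the nesting concrete, here is a hand-written sketch of roughly what a two-binding my-let form becomes once both macros are fully expanded. The macro definitions are repeated so the snippet runs on its own; the names hand-expanded and via-macro are introduced only for this illustration, and the expansion printed at your REPL will show namespace-qualified names instead of the bare ones used here.

```clojure
(defmacro single-arg-fn [binding-form & body]
  `((fn [~(first binding-form)] ~@body) ~(second binding-form)))

(defmacro my-let [lettings & body]
  (if (empty? lettings)
    `(do ~@body)
    `(single-arg-fn ~(take 2 lettings)
       (my-let ~(drop 2 lettings) ~@body))))

;; Hand-expanded equivalent of (my-let [x 10 y x] (* x y)):
;; each binding becomes a unary function applied to its value.
(def hand-expanded
  ((fn [x]
     ((fn [y]
        (do (* x y)))
      x))
   10))

;; The same computation, this time letting the macros do the rewriting.
(def via-macro
  (my-let [x 10
           y x]
    (* x y)))
```

Both vars end up holding 100, which is what you would expect if the macro and the hand expansion describe the same nested calls.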

You’re not doing any error checking here, but hopefully this example has reminded you how macros work, as well as shown you how to seemingly add features to the Clojure language itself. Use macroexpand-1, macroexpand, and clojure.walk/macroexpand-all to get a hint as to how my-let does its thing. We’re now ready to look beyond the basics.

In the sections that follow, we’re going to explore three new concepts. The first is that of anaphora, an approach to writing macros that uses intentional variable capture to its advantage. You’ll see why such macros are called anaphoric and how they can be used to add special syntax features to Clojure.

The second concept we’ll explore is the idea of moving some of the computation from a program’s runtime into its compile time. Some computation that would otherwise be done when the program is already running will now be done while the code is being compiled. You’ll see not only where this might be useful but also an example of precomputing decryption tables.

Finally, we’ll look at writing macros that generate other macros. This can be tricky, and we’ll look at a simple example of such a macro. Understanding macro-generating macros is a sign of being on the path to macro Zen.

Without further ado, our first stop is Clojure anaphora.

11.2. Anaphoric macros

In chapter 7, we talked about the issue of variable capture. As a reminder, variable capture happens when a variable name in a macro expansion (say in a generated let binding) shadows something outside that immediate scope (say in an outer let binding). You saw that Clojure solves this issue in an elegant manner through two processes: the first is that names inside a macro template get namespace qualified to the namespace that the macro is defined in, and the second is by providing a convenient auto-gensym reader macro.

Macros that do their work based on intentional variable capture are called anaphoric macros (anaphor means a word or phrase that refers to an earlier word or phrase). In this section, we’ll do more variable capture but in a slightly more complex manner. To get things started, we’ll visit a commonly cited example that illustrates this concept. You’ll then build on it to write a useful utility macro.

11.2.1. The anaphoric if

Writing the anaphoric version of the if construct is the “Hello, world!” of anaphora. The anaphoric if is probably one of the simplest of its ilk, but it illustrates the point well, while also being a useful utility macro.

Consider the following example, where you first do a computation, check if it’s truthy, and then proceed to use it in another computation. Imagine that you had the following function:

(defn some-computation [x]
  (if (even? x) false (inc x)))

It’s a placeholder to illustrate the point we’re about to make. Now consider a use case as follows:

(if (some-computation 11)
  (* 2 (some-computation 11)))
;=> 24

Naturally, you wouldn’t stand for such duplication, and you’d use the let form to remove it:

(let [computation (some-computation 11)]
  (if computation
    (* 2 computation)))

You also know that you don’t need to stop here, because you can use the handy if-let macro:

(if-let [computation (some-computation 11)]
  (* 2 computation))

Although this is clear enough, it would be nice if you could write something like the following, for it to read more clearly:

(anaphoric-if (some-computation 11)
  (* 2 it))

Here, it is a symbol that represents the value of the condition clause. Most anaphoric macros use pronouns such as it to refer to some value that was computed.

Although the anaphoric style can produce very compact and easy-to-read code (that is, if you know what the anaphoric names are), Clojure idiom prefers to allow the user to provide a name to bind, such as in if-let. You should prefer this idiom, too, because such code is clearer and allows nested forms easily while being only slightly more verbose. But the anaphoric style is useful in DSLs, especially those designed for non-programmers.

Implementing anaphoric-if

Now that you’ve seen what you’d like to express in the code, let’s set about implementing it. You could imagine writing it as follows:

(defmacro anaphoric-if [test-form then-form]
  `(if-let [~'it ~test-form]
     ~then-form))

Here’s the macro expansion of the example from earlier:

(macroexpand-1 '(anaphoric-if (some-computation 11)
                  (* 2 it)))
;=> (clojure.core/if-let [it (some-computation 11)] (* 2 it))

That expansion looks exactly like what you need because it creates a local name it and binds the value of the test-form to it. It then evaluates the then-form inside the let block created by the if-let form, which ensures that it happens only if the value of it is truthy. Here it is in action:

(anaphoric-if (some-computation 12)
  (* 2 it))
;=> nil
(anaphoric-if (some-computation 11)
  (* 2 it))
;=> 24

Notice how you had to force Clojure not to namespace qualify the name it. You do this by unquoting a quoted symbol (that’s what the strange notation ~'it is). This forces the variable capture. You’ll use this technique (and the unquote splice version of it) again in the following sections.

Note

Remember that when you’re using anaphora, you’re using variable capture. So although it may be okay that the symbol it is captured in this example, that may not always be the case. Watch for situations where intentional variable capture can cause subtle bugs.
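As a small illustration of the kind of subtle bug the note warns about, the following sketch nests two anaphoric-if forms. The var names shadowed and both are made up for this example. Because every expansion binds the same symbol, the inner it hides the outer one, whereas if-let lets each value keep its own name.

```clojure
(defmacro anaphoric-if [test-form then-form]
  `(if-let [~'it ~test-form]
     ~then-form))

;; The inner form rebinds `it`, so the outer value (1) is unreachable here.
(def shadowed
  (anaphoric-if 1
    (anaphoric-if 2
      it)))
;; shadowed is 2

;; With if-let, each value gets its own name and both stay visible.
(def both
  (if-let [outer 1]
    (if-let [inner 2]
      [outer inner])))
;; both is [1 2]
```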

Now that you have an anaphoric version of if, you’ll write a macro that generalizes it a little.

Generalizing the anaphoric if

Recall the implementation of the anaphoric if macro:

(defmacro anaphoric-if [test-form then-form]
  `(if-let [~'it ~test-form]
     ~then-form))

Note that you built this on the if-let macro, which in turn is built on the if special form. If you were to remove the hard dependency on the if special form and instead specify it at call time, you could have a more general version of this code on your hands. Let’s take a look:

(defmacro with-it [operator test-form & exprs]
  `(let [~'it ~test-form]
     (~operator ~'it ~@exprs)))

So, you take the idea from anaphoric-if and create a new version of it where you need to pass in the thing you’re trying to accomplish. For instance, the example from before would now read like this:

(with-it if (some-computation 12)
  (* 2 it))
;=> nil
(with-it if (some-computation 11)
  (* 2 it))
;=> 24

Why would you want to do this? Because now you can have an anaphoric version of more than just the if form. For example, you could create anaphoric versions of and and when, as shown here:

(with-it and (some-computation 11) (> it 10) (* 2 it))
;=> 24

Or you could do this:

(with-it when (some-computation 11)
  (println "Got it:" it)
  (* 2 it))
Got it: 12
;=> 24

Try these out at the REPL, and also try versions that use if-not, or, when-not, and so on. You could even go back and define macros like anaphoric-if in terms of with-it, for instance:

(defmacro anaphoric-if [test-form then-form]
  `(with-it if ~test-form ~then-form))

You could define all such variants (using if, and, or, and so on) in one fell swoop.
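One possible way to define several variants at once is sketched here. The macro name def-anaphoric-variants and the anaphoric- naming scheme are assumptions made for this example, not part of the chapter's code. It builds each defmacro form with plain list functions rather than nested backquotes, and it assumes with-it is defined in the namespace where the generated macros are used.

```clojure
(defmacro with-it [operator test-form & exprs]
  `(let [~'it ~test-form]
     (~operator ~'it ~@exprs)))

;; Hypothetical helper: for each operator symbol, emit
;;   (defmacro anaphoric-<op> [& args] (list* 'with-it '<op> args))
;; The generated macros expand to an unqualified with-it call,
;; so with-it must be visible where they are used.
(defmacro def-anaphoric-variants [& operators]
  `(do
     ~@(for [op operators]
         (list 'defmacro (symbol (str "anaphoric-" op))
               '[& args]
               (list 'list* ''with-it (list 'quote op) 'args)))))

(def-anaphoric-variants and when)

(anaphoric-and 12 (> it 10) (* 2 it))
;=> 24
(anaphoric-when 12 (* 2 it))
;=> 24
```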

This wraps up our introduction to anaphoric macros. As we mentioned at the start of this section, these examples are quite simple. The next one will be slightly more involved.

11.2.2. The thread-it macro

A couple of the most useful macros in Clojure’s core namespace are the threading macros—the thread-first macro (->) and the thread-last macro (->>), which we covered in chapter 2. As a refresher, you’ll write a function to calculate the surface area of a cylinder with a radius r and height h. The formula is

2 * PI * r * (r + h)

Using the thread-first macro, you can write this as

(defn surface-area-cylinder [r h]
  (-> r
    (+ h)
    (* 2 Math/PI r)))

You saw a similar example when you first encountered this macro. Instead of writing something like a let form with intermediate results of a larger computation, the first form is placed into the next form as the first argument, the resulting form is then placed into the next form as its first argument in turn, and so on. It’s a significant improvement in code readability.

The thread-last macro is the same, but instead of placing consecutive results in the first argument position of the following form, it places them in the position of the last argument. It’s useful in code that’s similar to the following hypothetical example:

(defn some-calculation [a-collection]
  (->> (seq a-collection)
       (filter some-pred?)
       (map a-transform)
       (reduce another-function)))

Threading in any position

Now, although both the thread-first and thread-last macros are extremely useful, they do have a possible shortcoming: they both fix the position of where each step of the computation is placed into the next form. The thread-first macro places it as the first argument of the next call, whereas the thread-last macro places it in the position of the last argument.

Occasionally, this can be limiting. Consider the previous code snippet. Imagine you wanted to use a function written by someone else called compute-averages-from that accepts two arguments: a sequence of data and a predicate, in that order. As it stands, you couldn’t plug that function into the threaded code shown previously, because the order of arguments was reversed. You’d have to adapt the function, perhaps as follows:
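A sketch of such an adapter, wrapping the call in an anonymous function so the threaded value lands in the first argument position. The four definitions at the top are hypothetical stand-ins (the chapter never defines them) so that the example runs; only the shape of another-calculation matters.

```clojure
;; Hypothetical stand-ins so the sketch is runnable.
(def some-pred? even?)
(def a-transform inc)
(def another-pred? pos?)
(defn compute-averages-from [data pred]
  ;; Assumed behavior: average of the items that satisfy pred.
  (let [kept (filter pred data)]
    (when (seq kept)
      (/ (reduce + kept) (count kept)))))

(defn another-calculation [a-collection]
  (->> (seq a-collection)
       (filter some-pred?)
       (map a-transform)
       ;; The extra parentheses call the anonymous function with the
       ;; threaded value, which it then passes as the FIRST argument.
       (#(compute-averages-from % another-pred?))))

(another-calculation [1 2 3 4])
;=> 4
```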

You’ve seen the use of anonymous functions to create adapter functions such as this before, but it isn’t pretty. It spoils the overall elegance by adding some noise to the code. What if, instead of being limited to threading forms as the first and last arguments of subsequent forms, you could choose where to thread them? Clojure 1.5 introduced the as-> threading macro (which you saw in chapter 2) to do just that. You could rewrite the preceding example as follows:
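A sketch of how the as-> version might look, with the binding symbol named result (an arbitrary choice). As before, the helper definitions are hypothetical stand-ins that exist only to make the example runnable.

```clojure
;; Hypothetical stand-ins so the sketch is runnable.
(def some-pred? even?)
(def a-transform inc)
(def another-pred? pos?)
(defn compute-averages-from [data pred]
  ;; Assumed behavior: average of the items that satisfy pred.
  (let [kept (filter pred data)]
    (when (seq kept)
      (/ (reduce + kept) (count kept)))))

(defn another-calculation [a-collection]
  ;; as-> takes the initial expression first, then the name to bind,
  ;; and rebinds that name to the result of each successive form.
  (as-> (seq a-collection) result
        (filter some-pred? result)
        (map a-transform result)
        (compute-averages-from result another-pred?)))

(another-calculation [1 2 3 4])
;=> 4
```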

Implementing thread-it

Just like if-let, as-> requires that you supply a symbol for bindings. You created an anaphoric-if macro that always binds to the symbol it. Now you’re going to create a version of as->, called thread-it, which will always bind to the symbol it. With this new macro, you’ll be able to do something like this:

(defn yet-another-calculation [a-collection]
  (thread-it (seq a-collection)
             (filter some-pred? it)
             (map a-transform it)
             (compute-averages-from it another-pred?)))

Before we jump into the implementation, let’s make one more change relative to Clojure’s built-in threading macros, which expect at least one argument. You’d like to be able to call the thread-it macro without any arguments. This may be useful when you’re using it inside another macro. Although the following doesn’t work

(->> )
ArityException Wrong number of args (0) passed to: core/->>  clojure.lang.Compiler.macroexpand1 (Compiler.java:6557)

you’d like the macro to do this:

(thread-it)
;=> nil

Now we’re ready to look at the implementation. Consider the following:

(defmacro thread-it [& [first-expr & rest-expr]]
  (if (empty? rest-expr)
    first-expr
    `(let [~'it ~first-expr]
       (thread-it ~@rest-expr))))

As you can see, the macro accepts any number of arguments. The list of arguments is destructured into the first (named first-expr) and the rest (named rest-expr). The first task is to check to see if rest-expr is empty (which happens when either no arguments were passed in or a single argument was passed in). If this is so, the macro will return first-expr, which will be nil if there were no arguments passed into thread-it or the single argument if only one was passed in.

If there are arguments remaining inside rest-expr, the macro expands to another call to itself, with the symbol it bound to the value of first-expr, nestled inside a let block. This recursive macro definition expands until it has consumed all the forms it was passed in. Here’s an example of it in action:

(thread-it (* 10 20) (inc it) (- it 8) (* 10 it) (/ it 5))
;=> 386

Also, with the way it’s implemented, the following behavior is expected:

(thread-it it)
CompilerException java.lang.RuntimeException: Unable to resolve symbol: it in this context, compiling:(NO_SOURCE_PATH:1:1)

This happens because you don’t start by binding anything to it. You could change this behavior by initially binding it to a default value of some kind. That’s all there is to the implementation. It can be a useful macro in situations where the functions (or macros) in a threading form take arguments in an irregular order. Further, as a refinement, or perhaps as another version of this macro, you could replace the let with an if-let. This will short-circuit the computation if any step results in a logically false value, much like the some-> and some->> macros (which stop only on nil, whereas if-let also stops on false).
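The if-let refinement could be sketched as follows. The name thread-it-some is invented for this example. The only change from thread-it is that each step is bound with if-let, so a nil or false result ends the chain before the next form runs.

```clojure
(defmacro thread-it-some [& [first-expr & rest-expr]]
  (if (empty? rest-expr)
    first-expr
    ;; if-let skips the remaining steps when first-expr is nil or false.
    `(if-let [~'it ~first-expr]
       (thread-it-some ~@rest-expr))))

;; (:b it) is nil, so (inc it) is never evaluated.
(thread-it-some {:a 1} (:b it) (inc it))
;=> nil

;; When every step is truthy, it behaves like thread-it.
(thread-it-some (* 10 20) (inc it))
;=> 201
```

With the plain let version, the first call would instead try to evaluate (inc nil) and throw.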

This leads us to the end of the discussion on anaphora. It’s a useful technique at times, even though it breaks hygiene because it involves variable capture. As mentioned, you have to be careful while using it, but when you do, it can result in code that’s more readable than it would be otherwise.

11.3. Shifting computation to compile time

Our next stop is to examine another use case of macros. You’re going to make the Clojure compiler work harder by doing some work that would otherwise have to be done by your program at runtime.

11.3.1. Example: Rotation ciphers without macros

So far in this book, you’ve seen several uses of macros and have written several macros yourself. In this section, you’re going to see another use of macros, and it has to do with performance. To illustrate the concept, we’ll examine a simple cipher called ROT13. The name stands for “rotate by 13 places,” and the cipher can be broken quite easily. Its purpose is to hide text from a casual glance, not to communicate spy secrets. It’s commonly used as the online equivalent of text printed upside down (for example, in magazines and newspapers), to give out puzzle solutions, answers to riddles, and the like.

About the ROT13 cipher

Table 11.1 shows what each letter of the alphabet corresponds to.

Table 11.1. Alphabet rotated by 13 places

1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26
a	b	c	d	e	f	g	h	i	j	k	l	m	n	o	p	q	r	s	t	u	v	w	x	y	z
n	o	p	q	r	s	t	u	v	w	x	y	z	a	b	c	d	e	f	g	h	i	j	k	l	m

The first row is the index for each letter of the alphabet, starting at 1. The second row is the alphabet itself. The last row is the alphabet shifted by 13 places. Each letter on this last row corresponds to the letter that will be used in place of the letter above it in a message encrypted using this cipher system. For example, the word abracadabra becomes noenpnqnoen.

Decrypting a rotation cipher is usually done by rotating each letter back the same number of times. ROT13 has the additional property of being a reciprocal cipher. A message encrypted using a reciprocal cipher can be decrypted by running it through the cipher system itself. The encryption process also works to decrypt encrypted messages. In this section, you’ll implement a generalized rotation cipher by allowing the rotation length to be passed in as a parameter.

Generalized rotation ciphers

Let’s begin the implementation with the letters of the alphabet. Recall that Clojure has a convenient reader macro to represent literal characters:

(def ALPHABETS [\a \b \c \d \e \f \g \h \i \j \k \l \m \n \o \p \q \r \s \t
                \u \v \w \x \y \z])

Let’s also define a few convenience values based on the alphabet shown:

(def NUM-ALPHABETS (count ALPHABETS))
(def INDICES (range 1 (inc NUM-ALPHABETS)))
(def lookup (zipmap INDICES ALPHABETS))

Now, let’s talk about your approach. Because you want to implement a generic rotation mechanism, you’ll need to know at which numbered slot a letter falls when it’s rotated a specific number of times. You’d like to take a slot number such as 14, rotate it by a configurable number, and see where it ends up. For example, in the case of ROT13, the letter in slot 10 (which is the letter j) ends up in slot 23. You’ll write a function called shift, which will compute this new slot number. You can’t simply add the shift-by number to the slot number, because you have to take care of overflow. Here’s the implementation of shift:

(defn shift [shift-by index]
  (let [shifted (+ (mod shift-by NUM-ALPHABETS) index)]
    (cond
      (<= shifted 0) (+ shifted NUM-ALPHABETS)
      (> shifted NUM-ALPHABETS) (- shifted NUM-ALPHABETS)
      :default shifted)))

There are a couple of points to note here. The first is that you calculated shifted by adding (mod shift-by NUM-ALPHABETS) to the given index (and not shift-by) so that you can handle the cases where shift-by is more than NUM-ALPHABETS. Because you handle overflow by wrapping to the beginning, this approach works, for example:

(shift 10 13)
;=> 23
(shift 20 13)
;=> 7
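A few more calls, sketched here, show why taking the modulus first matters. NUM-ALPHABETS is hardcoded to 26 in this snippet (the value the earlier definition computes) so that it runs on its own. Because Clojure’s mod returns a non-negative result for a positive divisor, a negative shift-by behaves like the equivalent forward rotation.

```clojure
(def NUM-ALPHABETS 26) ; same value as (count ALPHABETS) in the chapter

(defn shift [shift-by index]
  (let [shifted (+ (mod shift-by NUM-ALPHABETS) index)]
    (cond
      (<= shifted 0) (+ shifted NUM-ALPHABETS)
      (> shifted NUM-ALPHABETS) (- shifted NUM-ALPHABETS)
      :default shifted)))

(shift 13 10)   ; plain ROT13 on slot 10
;=> 23
(shift 39 10)   ; 39 is 13 more than a full turn, so same result
;=> 23
(shift -13 10)  ; rotating back by 13 equals rotating forward by 13
;=> 23
(shift 26 5)    ; a full turn leaves the slot unchanged
;=> 5
(shift -1 1)    ; slot 1 rotated back by one wraps to slot 26
;=> 26
```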

Now that you have this function, you can use it to create a simple cryptographic tableau, a table of rows and columns with which you can decrypt or encrypt information. In this case, for ROT13, the tableau would be the second and third rows from table 11.1. Here’s a function that computes this:

(defn shifted-tableau [shift-by]
  (->> (map #(shift shift-by %) INDICES)
       (map lookup)
       (zipmap ALPHABETS)))

This creates a map where the keys are the letters that need to be encrypted, and the values are their cipher versions. Here’s an example:

(shifted-tableau 13)
;=> {\a \n, \b \o, \c \p, \d \q, \e \r, \f \s, \g \t, \h \u, \i \v, \j \w, \k
     \x, \l \y, \m \z, \n \a, \o \b, \p \c, \q \d, \r \e, \s \f, \t \g, \u
     \h, \v \i, \w \j, \x \k, \y \l, \z \m}

Because this cipher is quite simple, a simple map such as this suffices. Now that you have your tableau, encrypting messages is as simple as looking up each letter. Here’s the encrypt function:

(defn encrypt [shift-by plaintext]
  (let [shifted (shifted-tableau shift-by)]
    (apply str (map shifted plaintext))))

Try it at the REPL:

(encrypt 13 "abracadabra")
;=> "noenpnqnoen"

That works as expected. Recall that ROT13 is a reciprocal cipher. Check to see if it works:

(encrypt 13 "noenpnqnoen")
;=> "abracadabra"
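One thing worth noticing about this encrypt function is that any character missing from the tableau (spaces, digits, uppercase letters) maps to nil and so disappears from the output. The sketch below shows that behavior and one possible variant that passes such characters through unchanged. The name encrypt-keeping-others is an assumption for this example, and ALPHABETS is built with vec from a string purely to keep the snippet short.

```clojure
(def ALPHABETS (vec "abcdefghijklmnopqrstuvwxyz")) ; same 26 characters
(def NUM-ALPHABETS (count ALPHABETS))
(def INDICES (range 1 (inc NUM-ALPHABETS)))
(def lookup (zipmap INDICES ALPHABETS))

(defn shift [shift-by index]
  (let [shifted (+ (mod shift-by NUM-ALPHABETS) index)]
    (cond
      (<= shifted 0) (+ shifted NUM-ALPHABETS)
      (> shifted NUM-ALPHABETS) (- shifted NUM-ALPHABETS)
      :default shifted)))

(defn shifted-tableau [shift-by]
  (->> (map #(shift shift-by %) INDICES)
       (map lookup)
       (zipmap ALPHABETS)))

(defn encrypt [shift-by plaintext]
  (let [shifted (shifted-tableau shift-by)]
    (apply str (map shifted plaintext))))

;; Hypothetical variant: fall back to the original character
;; when the tableau has no entry for it.
(defn encrypt-keeping-others [shift-by plaintext]
  (let [shifted (shifted-tableau shift-by)]
    (apply str (map #(get shifted % %) plaintext))))

(encrypt 13 "hello world")
;=> "uryybjbeyq"
(encrypt-keeping-others 13 "hello world")
;=> "uryyb jbeyq"
```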

It does! If you rotate by anything other than 13, you’ll need a real decrypt function. All you need to do to decrypt a message is to reverse the process. You’ll express that as follows:

(defn decrypt [shift-by encrypted]
  (encrypt (- shift-by) encrypted))

decrypt works by rotating an encrypted message the other way by the same rotation. This shows how it works at the REPL:

(decrypt 13 "noenpnqnoen")
;=> "abracadabra"

Great, so you have all the bare necessities in place. To implement a particular cipher, such as ROT13, you can define a pair of functions as follows:

(def encrypt-with-rot13 (partial encrypt 13))
(def decrypt-with-rot13 (partial decrypt 13))

Now try it at the REPL:

(decrypt-with-rot13 (encrypt-with-rot13 "abracadabra"))
;=> "abracadabra"

So there you have it; you’ve implemented the simple cipher system. The complete code is shown in the following listing.

Listing 11.1. A general rotation cipher system to implement things like ROT13

(ns clj-in-act.ch11.shifting)
(def ALPHABETS [\a \b \c \d \e \f \g \h \i \j \k \l \m \n \o \p \q \r \s \t
                \u \v \w \x \y \z])
(def NUM-ALPHABETS (count ALPHABETS))
(def INDICES (range 1 (inc NUM-ALPHABETS)))
(def lookup (zipmap INDICES ALPHABETS))
(defn shift [shift-by index]
  (let [shifted (+ (mod shift-by NUM-ALPHABETS) index)]
    (cond
      (<= shifted 0) (+ shifted NUM-ALPHABETS)
      (> shifted NUM-ALPHABETS) (- shifted NUM-ALPHABETS)
      :default shifted)))
(defn shifted-tableau [shift-by]
  (->> (map #(shift shift-by %) INDICES)
       (map lookup)
       (zipmap ALPHABETS)))
(defn encrypt [shift-by plaintext]
  (let [shifted (shifted-tableau shift-by)]
    (apply str (map shifted plaintext))))
(defn decrypt [shift-by encrypted]
  (encrypt (- shift-by) encrypted))
(def encrypt-with-rot13 (partial encrypt 13))
(def decrypt-with-rot13 (partial decrypt 13))

The issue with this implementation is that you compute the tableau each time you encrypt or decrypt a message. This is easily fixed by memoizing the shifted-tableau function. That takes care of the problem, but in the next section, you’ll go one step further.
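The memoization fix might look like the following sketch. The counter tableau-builds and the name shifted-tableau-memo are additions made for this example; the counter exists only to show how often the tableau is actually built.

```clojure
(def ALPHABETS (vec "abcdefghijklmnopqrstuvwxyz")) ; same 26 characters
(def NUM-ALPHABETS (count ALPHABETS))
(def INDICES (range 1 (inc NUM-ALPHABETS)))
(def lookup (zipmap INDICES ALPHABETS))

(defn shift [shift-by index]
  (let [shifted (+ (mod shift-by NUM-ALPHABETS) index)]
    (cond
      (<= shifted 0) (+ shifted NUM-ALPHABETS)
      (> shifted NUM-ALPHABETS) (- shifted NUM-ALPHABETS)
      :default shifted)))

;; Illustration only: counts how many times a tableau is computed.
(def tableau-builds (atom 0))

(defn shifted-tableau [shift-by]
  (swap! tableau-builds inc)
  (->> (map #(shift shift-by %) INDICES)
       (map lookup)
       (zipmap ALPHABETS)))

;; memoize caches results per argument, so each shift-by value
;; is computed at most once.
(def shifted-tableau-memo (memoize shifted-tableau))

(defn encrypt [shift-by plaintext]
  (let [shifted (shifted-tableau-memo shift-by)]
    (apply str (map shifted plaintext))))

(encrypt 13 "abracadabra")
(encrypt 13 "noenpnqnoen")
@tableau-builds
;=> 1
```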

11.3.2. Making the compiler work harder

So far, you’ve implemented functions to encrypt and decrypt messages for any rotation cipher. Your basic approach has been to create a map that can help you code (or decode) each letter in a message to its cipher version. As discussed at the end of the previous section, you can speed up your implementation by memoizing the tableau calculation.

Even with memoize, the computation still happens at least once (the first time the function is called). Imagine, instead, if you created an inline literal map containing the appropriate tableau data. You could then look it up in the map each time, without having to compute it. Such a definition of encrypt-with-rot13 might look like this:

(defn encrypt-with-rot13 [plaintext]
  (apply str (map {\a \n \b \o \c \p} plaintext)))

In a real implementation, the tableau would be complete for all the letters of the alphabet, not only for \a, \b, and \c. In any case, if you did have such a literal map in the code itself, it would obviate the need to compute it at runtime. Luckily, you’re coding in Clojure, and you can bend it to your will. Consider the following:

(defmacro def-rot-encrypter [name shift-by]
  (let [tableau (shifted-tableau shift-by)]
    `(defn ~name [~'message]
       (apply str (map ~tableau ~'message)))))

This macro first computes the tableau for shift-by as needed and then defines a function by the specified name. The function body includes the computed table, in the right place, as illustrated in the preceding code sample. Look at its expansion:

(macroexpand-1 '(def-rot-encrypter encrypt13 13))

;=> (clojure.core/defn encrypt13 [message] (clojure.core/apply clojure.core/str
     (clojure.core/map {\a \n, \b \o, \c \p, \d \q, \e \r, \f \s, \g \t,
     \h \u, \i \v, \j \w, \k \x, \l \y, \m \z, \n \a, \o \b, \p \c, \q \d,
     \r \e, \s \f, \t \g, \u \h, \v \i, \w \j, \x \k, \y \l, \z \m} message)))

This looks almost exactly like the desired function, with an inline literal tableau map. Figure 11.1 shows the flow of the code.

Figure 11.1. As usual, the Clojure reader first converts the text of your programs into data structures. During this process, macros are expanded, including the `def-rot-encrypter` macro, which generates a tableau. This tableau is a Clojure map and is included in the final form of the source code as an inline lookup table.

Now check to see if it works:

(def-rot-encrypter encrypt13 13)
;=> #'user/encrypt13
(encrypt13 "abracadabra")
;=> "noenpnqnoen"

And there you have it. The new encrypt13 function at runtime doesn’t do any tableau computation at all. If, for instance, you were to ship this code off to users as a Java library, they wouldn’t even know that shifted-tableau was ever called.

As a final item, you’ll create a convenient way to define a pair of functions that can be used to encrypt or decrypt messages in a rotation cipher:

(defmacro define-rot-encryption [shift-by]
  `(do
     (def-rot-encrypter ~(symbol (str "encrypt" shift-by)) ~shift-by)
     (def-rot-encrypter ~(symbol (str "decrypt" shift-by)) ~(- shift-by))))

And finally, here it is in action:

(define-rot-encryption 15)
;=> #'user/decrypt15

Here, it prints the decrypt function var, because it was the last thing the macro expansion did. Now use the new pair of functions:

(encrypt15 "abracadabra")
;=> "pqgprpspqgp"
(decrypt15 "pqgprpspqgp")
;=> "abracadabra"

Shifting computation to the compile cycle can be a useful trick when parts of the computation needed are known in advance. Clojure macros make it easy to run arbitrary code during the expansion phase and give the programmer the power of the full Clojure language itself. In this example, for instance, you wrote the shifted-tableau function with no prior intention of using it in this manner. Moving computation into macros this way can be quite handy at times, despite how simple it is to do.
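The same trick in miniature, as a sketch unrelated to ciphers: the hypothetical macro precomputed-squares builds a vector of squares while it is being expanded, so the code that finally runs contains only the literal vector. As with def-rot-encrypter, this assumes the argument is a literal number known at expansion time.

```clojure
;; Hypothetical example macro: the mapv call runs during macro
;; expansion, and the resulting vector becomes the expansion itself.
(defmacro precomputed-squares [n]
  (mapv #(* % %) (range n)))

(macroexpand-1 '(precomputed-squares 5))
;=> [0 1 4 9 16]

;; At runtime this function only indexes into a literal vector.
(defn square-of [i]
  (nth (precomputed-squares 5) i))

(square-of 3)
;=> 9
```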

11.4. Macro-generating macros

Now that you understand what it is to move computation to the compile phase of program execution, you’re ready for a new adventure. You’ll expand your mind a little as you try to write code that writes code that writes code—that is, you’re going to write a macro that writes a macro.

Let’s look at an example of a macro that will create a synonym for an existing function or macro. Imagine you have two dynamic vars as follows (binding works only on vars marked ^:dynamic):

(declare ^:dynamic x ^:dynamic y)
;=> #'user/y

And if you use the new macro make-synonym

(make-synonym b binding)
;=> #'user/b

then the following should work:

(b [x 10 y 20] [x y])
;=> [10 20]

You’ll implement the make-synonym macro in this section.

11.4.1. Example template

When writing a macro, it’s usually easier to start with an example of the desired expansion. Here’s what you want to write:

(b [x 10 y 20] (println "X,Y:" x y))

And for it to work, b should be replaced with binding, resulting in the expansion

(binding [x 10 y 20] (println "X,Y:" x y))

You could easily solve this if you wrote a custom macro defining b in terms of binding, as follows:

(defmacro b [& stuff]
  `(binding ~@stuff))

This replaces the symbol b with the symbol binding, keeping everything else the same. You aren’t interested in the vars being bound, or the body itself, which is why you lump everything into stuff.
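A quick check of this hand-written b, as a sketch. It declares its own vars x and y with ^:dynamic metadata, because binding refuses to rebind vars that aren’t dynamic.

```clojure
;; binding requires dynamic vars, hence the ^:dynamic metadata.
(declare ^:dynamic x ^:dynamic y)

(defmacro b [& stuff]
  `(binding ~@stuff))

;; binding is namespace-qualified by the backquote; the rest is untouched.
(macroexpand-1 '(b [x 10] x))
;=> (clojure.core/binding [x 10] x)

(b [x 10 y 20] [x y])
;=> [10 20]
```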

Now that you have a version of b that works as expected, you need to generalize it into make-synonym. The previous code is an example of what the make-synonym macro ought to produce.

11.4.2. Implementing make-synonym

You know that make-synonym is a macro and that it accepts two parameters:

A new symbol that will be the synonym of the existing macro or function
The name of the existing macro or function

You can begin implementing the new macro by starting with an empty definition:

(defmacro make-synonym [new-name old-name])

The next question is, what should go in the body? You can start by putting in the sample expansion from the previous section. Here’s what it looks like:

(defmacro make-synonym [new-name old-name]
  (defmacro b [& stuff]
    `(binding ~@stuff)))

Obviously, this won’t work as desired, because no matter what’s passed in as arguments to this version of make-synonym, it will always create a macro named b (that expands to binding).

What you want, instead, is for make-synonym to produce the inner form containing the call to defmacro, instead of calling it. You know you can do this using the backquote. In this case, you’ll have two backquotes. While you’re at it, instead of the hardcoded symbols b and binding, you’ll use the names passed in as parameters. Consider the following increment of the make-synonym macro:

(defmacro make-synonym [new-name old-name]
  `(defmacro ~new-name [& stuff]
     `(~old-name ~@stuff)))

This is a little confusing, because you have two backquotes in play here, one nested inside the other. The easiest way to understand what’s happening is to look at an expansion. So try it at the REPL:

(macroexpand-1 '(make-synonym b binding))
;=> (clojure.core/defmacro b [& user/stuff]
      (clojure.core/seq
        (clojure.core/concat (clojure.core/list user/old-name)
                             user/stuff)))

To understand this expansion, let’s first look at what happens to a backquote when it’s expanded:

(defmacro back-quote-test []
  `(something))
;=> #'user/back-quote-test
(macroexpand '(back-quote-test))
;=> (user/something)

This isn’t surprising, because Clojure namespace-qualifies any names unless explicitly asked not to. Now, add a backquote:

(defmacro back-quote-test []
  ``(something))
;=> #'user/back-quote-test

You’ve added another backquote to the one already present. What you’re saying is that instead of expanding the backquoted form and using its return value as the expansion of the back-quote-test macro, you want the backquoting mechanism itself. Again, to refresh your memory of how backquotes work, refer to chapter 7. Here it is at the REPL:

(macroexpand '(back-quote-test))
;=> (clojure.core/seq
    (clojure.core/concat (clojure.core/list (quote user/something))))

Because you’re using the symbol something as is, Clojure namespace-qualifies it, as you’d expect. Now that you know what the backquote mechanism itself is, you can return to the expansion of make-synonym:

(macroexpand-1 '(make-synonym b binding))
;=> (clojure.core/defmacro b [& user/stuff]
      (clojure.core/seq
        (clojure.core/concat (clojure.core/list user/old-name)
                             user/stuff)))

Here, the symbol b gets substituted as part of the outer backquote expansion. Because you don’t explicitly quote the symbol stuff, it gets namespace qualified (you’ll need to fix that soon). To understand what’s happening to old-name inside the nested backquote, look at the following:

(defmacro back-quote-test []
  ``(~something))
;=> #'user/back-quote-test
(macroexpand '(back-quote-test))
;=> (clojure.core/seq (clojure.core/concat (clojure.core/list user/something)))

If you compare this to the previous version of back-quote-test and the expansion it generated, you’ll notice that user/something is no longer wrapped in a quote form. This is again as expected, because you’re unquoting it using the ~ reader macro. This explains why the nested backquote form of the make-synonym macro expands with user/old-name as it does. Again, you’ll need to fix this problem because you don’t want the symbol old-name but the argument passed in.

Finally, to see what’s going on with the unquote splicing and the stuff symbol, look at the following simpler example:

(defmacro back-quote-test []
  ``(~@something))
;=> #'user/back-quote-test
(macroexpand '(back-quote-test))
;=> (clojure.core/seq (clojure.core/concat user/something))

If you now compare this version of the expansion with the previous one, you’ll note that user/something is no longer wrapped in a call to list. This is in line with the expected behavior of unquote splicing in that it doesn’t add an extra set of parentheses.
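The difference between ~ and ~@ is easier to see with a single backquote. The following is a minimal sketch rather than part of make-synonym; the var xs is a made-up name, and list is used only because it resolves to clojure.core/list in any namespace.

```clojure
;; xs is a hypothetical value used only for this illustration.
(def xs [1 2 3])

;; ~ inserts the value as a single element.
`(list ~xs)
;=> (clojure.core/list [1 2 3])

;; ~@ splices the elements into the enclosing form,
;; without adding an extra set of parentheses.
`(list ~@xs)
;=> (clojure.core/list 1 2 3)
```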

At this point, we’ve walked through the complete expansion of the make-synonym macro. The only problem is that it still doesn’t do what you intended it to do. The two problems identified were that both stuff and old-name weren’t being expanded correctly. You’ll fix stuff first. Consider the following change to make-synonym:

(defmacro make-synonym [new-name old-name]
  `(defmacro ~new-name [& ~'stuff]
     `(~old-name ~@~'stuff)))

Here’s the expansion:

(macroexpand-1 '(make-synonym b binding))
;=> (clojure.core/defmacro b [& stuff]
      (clojure.core/seq (clojure.core/concat
                          (clojure.core/list user/old-name) stuff)))

Finally, you’ll fix the issue with user/old-name:

(defmacro make-synonym [new-name old-name]
  `(defmacro ~new-name [& ~'stuff]
     `(~'~old-name ~@~'stuff)))

And here’s the expansion:

(macroexpand-1 '(make-synonym b binding))
;=> (clojure.core/defmacro b [& stuff]
      (clojure.core/seq (clojure.core/concat
        (clojure.core/list (quote binding)) stuff)))

Notice the odd ~'~old-name quoting and unquoting. This evaluates like so: first, ~old-name is expanded, leaving ~'binding (the value of old-name) for the generated macro. Then the outer backquote is expanded, leaving you with 'binding, which finally becomes (quote binding). You have to do this to ensure that the value of old-name isn’t resolved until the generated macro expands.

To check to see if this is what you expect, compare it with your original template:

(defmacro b [& stuff]
  `(binding ~@stuff))

This is indeed what you set out to do, and you can test it as follows:

(declare ^:dynamic x ^:dynamic y)
;=> #'user/y
(make-synonym b binding)
;=> #'user/b
(b [x 10 y 20] [x y])
;=> [10 20]
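The same macro should work for any macro whose name you pass in. As a further check, here’s a sketch that creates a synonym for when-not. The name unless is chosen only for illustration, and make-synonym is repeated so the snippet stands alone.

```clojure
;; Final version of make-synonym, repeated so this snippet is self-contained.
(defmacro make-synonym [new-name old-name]
  `(defmacro ~new-name [& ~'stuff]
     `(~'~old-name ~@~'stuff)))

;; unless is a hypothetical synonym name.
(make-synonym unless when-not)

;; The generated macro expands into a call to the original macro.
(macroexpand-1 '(unless false :ran))
;=> (when-not false :ran)

(unless false :ran)
;=> :ran
```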

Phew, you’re finished. That was a lot of calisthenics for three lines of code. We’ll wrap up this section with why you even bothered with this somewhat esoteric code.

11.4.3. Why macro-generating macros

There are at least two reasons why it’s useful to know how to write macros that generate macros. The first is the same reason you’d write any other kind of macro: to create abstractions that remove the duplication that arises from patterns in the code. This is important when these duplications are structural and are difficult to eliminate without some form of code generation. Clojure macros are an excellent tool to do this job, because they give the programmer the full power of Clojure to do it. The fact that code generation is a language-level feature does pull its weight.

Having said this, although writing macros is common in Clojure programs, it’s rare for a macro to generate another macro. You’ll probably do it only a handful of times in your career. On the few occasions when you do need to abstract patterns out of macros themselves, macro-generating macros, combined with the other techniques you’ve seen (such as moving computation to compile time and intentional symbol capture), can lead to a solution that would be difficult to reach otherwise.

The second reason, and the more commonly useful one, for knowing this concept is to drive home the process of macro expansion, quoting, and unquoting. If you can understand and write macros that generate macros, then you’ll have no trouble writing simpler ones.

With these topics about macro writing out of the way, we’re ready to move on to a couple of examples. In the next section, we’ll look at using macros to create domain-specific languages (DSLs).

11.5. Domain-specific languages

We’re now going to look at explicitly doing something you’ve been doing implicitly so far. In several chapters, you’ve written macros that appear to add features to the Clojure language itself. For example,

In chapter 8, you created a simple object system with most of the semantics of regular object-oriented languages.
In chapter 9, you created def-modus-operandi, which allowed multimethods to be used in a manner similar to Clojure protocols.

These are just a couple of examples of macros helping you present your abstractions as a convenient feature of the language.

In this section, we’re going to further explore the idea of wrapping your abstractions in a layer of language. Taking this idea to its logical end brings us to the concept of metalinguistic abstraction—the approach of creating a domain-specific language that’s then used to solve the problem at hand. It allows you to solve not only the problem you started out with but a whole class of problems in that domain. It leaves you with a system that’s highly flexible and maintainable, while staying small and easier to understand and debug. Let’s begin by examining the design philosophy that leads to such systems.

11.5.1. DSL-driven design

To design a DSL you must consider two factors: how can the DSL decompose its problem domain into its various parts (the language’s “vocabulary”), and how will it facilitate composing those parts back together in expressive ways (its “grammar”)?

Design consideration #1: Decomposition

When given the requirements of a software program, the first step in creating a program to satisfy them usually involves thinking about what approach to take. This might end with a big design session that produces a detailed breakdown of the various components and pieces that will compose the final solution.

Top-down decomposition

This often goes hand in hand with the traditional top-down decomposition technique of taking something large and complex and breaking it into pieces that are smaller, independent, and easier to understand.

By itself, this approach has been known to not work particularly well in most cases. This is because the requirements for most systems are never specified perfectly, which causes the system to be redesigned in ways big and small. Many times, the requirements explicitly change over time as the reality of the business itself changes. This is why most agile teams prefer an evolutionary design, one that arises from incrementally building the system to satisfy more and more of the requirements over time.

When such an approach is desirable (and few systems can do without it these days), it makes sense to think not only in a top-down manner but also in a bottom-up way.

Bottom-up decomposition

Decomposing a problem in a bottom-up manner is different from the top-down version. With the bottom-up approach, you create small abstractions on top of the core programming language to handle tiny elements of the problem domain. These domain-specific primitives are created without explicit thought to exactly how they’ll eventually be used to solve the original problem. Indeed, at this stage, the idea is to create primitives that model all the low-level details of the problem domain.

Design consideration #2: Combinability

The other area of focus is combinability. The various domain primitives should be combinable into more complex entities as desired. This can be done using either the combinability features of the programming language itself (for instance, Clojure’s functions) or by creating new domain-specific constructs on top of existing ones. Macros can help with such extensions, because they can manipulate code forms with ease.

Functional programming aids in the pursuit of such a design. In addition to recursive and conditional constructs, being able to treat functions as first-class objects allows higher levels of complexity and abstraction to be managed in a more natural manner. Being able to create lexical closures adds another powerful piece to your toolset. When higher-order functions, closures, and macros are used together, the domain primitives can be combined to solve more than the original problem specified in the requirements document. It can solve a whole class of problems in that domain, because what gets created at the end of such a bottom-up process is a rich set of primitives, operators, and forms for combination that closely models the business domain itself.
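To make the idea concrete, here’s a small sketch of domain primitives built as closures and combined with a higher-order function. All of the names (referred-by?, uses-browser?, googling-safari-user?) are hypothetical and aren’t part of the system built later in this chapter; the session is assumed to be a plain map.

```clojure
;; All names below are hypothetical; they only illustrate the idea.

;; Domain primitives: each returns a predicate (a closure) over a session map.
(defn referred-by? [^String site]
  (fn [session]
    (let [^String referrer (:url-referrer session)]
      (boolean (and referrer (.contains referrer site))))))

(defn uses-browser? [browser]
  (fn [session]
    (= browser (:user-agent session))))

;; Combination: every-pred builds a new predicate from existing ones.
(def googling-safari-user?
  (every-pred (referred-by? "google") (uses-browser? :safari)))

(googling-safari-user? {:url-referrer "http://www.google.com/search?q=clojure"
                        :user-agent   :safari})
;=> true

(googling-safari-user? {:url-referrer nil :user-agent :safari})
;=> false
```

Because the primitives are ordinary functions, new combinations need no changes to the primitives themselves.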

The final layers of such a system consist of two pieces. The topmost is literally the respecification of the requirements in an executable DSL. This is metalinguistic abstraction, manifested in the fact that the final piece of the system that seems to solve the problem is written not in a general-purpose programming language but in a language that has been grown organically from a lower-level programming language. It’s often understandable by nonprogrammers and indeed is sometimes suitable for them to use directly.

The next piece is a sort of runtime adapter, which executes the domain-specific code either by interpreting it or by compiling it down to the language’s own primitives. An example may be a set of macros that translate the syntactically friendly code into other forms and code that sets up the right evaluation context for it. Figure 11.2 shows a block diagram of the various layers described.

Figure 11.2. The typical layers in a DSL-driven system. Such systems benefit from a bottom-up design where the lowest levels are the primitive concepts of the domain modeled on top of the basic Clojure language. Higher layers are compositions of these primitives into more complex domain concepts. Finally, a runtime layer sits on top of these, which can execute code specified in a DSL. This final layer often represents the core solution of the problem that the software was meant to solve.

It’s useful to point out that a domain-specific language isn’t about using macros, even though they’re often a big part of the final linguistics. Macros help with fluency of the language, especially as used by the end users but also at lower levels to help create the abstractions themselves. In this way, they’re no different from other available features of the language such as higher-order functions and conditionals. The point to remember is that the core of the DSL approach is the resulting bottom-up design and the set of easily combinable domain primitives.

In the next section, we’ll explore the creation of a simple DSL.

11.5.2. User classification

Most websites today personalize the experience for individual users by using users’ own data to improve their experience. Amazon, for example, shows users things they might like to buy based on their purchase history and browsing patterns. Other web services use similarly collected use statistics to show more relevant ads to users as they browse. In this section, we’ll explore this business domain.

The goal here is to use data about the user to do something special for them. It could be showing ads or making the site more specific to the user’s tastes. The first step in any such task is classifying the user. Usually, the system can recognize several classes of users and is able to personalize the experience for each class in some way. The business folks would like to be able to change the specification of the various segments as they’re discovered, so the system shouldn’t hardcode this aspect. Further, they’d like to make such changes quickly, potentially without requiring development effort and without requiring a restart of the system after making such changes. In an ideal world, they’d even like to specify the segment descriptions in a nice little GUI application.

This example is well suited to our earlier discussion, but aspects of this apply to most nontrivial systems being built today. For this example, you’ll build a DSL to specify the rules that classify users into various segments. To get started, you’ll describe the lay of the land, which in this case will be a small part of the overall system design, as well as a few functions available to find information about your users.

Data element

You’ll model a few primitive domain-specific data elements, focusing on things that can be gleaned from the data that users’ browsers send to the server along with every request. There’s nothing to stop you from extending this approach to things that are looked up from elsewhere, such as a database of users’ past behaviors, or indeed anything else, such as stock quotes or the weather in Hawaii. You’ll model the session data as a simple Clojure map containing the data elements you care about, and you’ll store it in Redis. You don’t need to focus on how you create the session map, because this example isn’t about parsing strings or loading data from various data stores.

Here’s an example of a user session:

{:consumer-id  "abc"
 :url-referrer "http://www.google.com/search?q=clojure+programmers"
 :search-terms ["clojure" "programmers"]
 :ip-address   "192.168.0.10"
 :tz-offset    420
 :user-agent   :safari}

Again, sessions can contain a lot more than what comes in via the web request. You can imagine loads of precomputed information being stored in such a session to enable more useful targeting as well as a caching technique so that things don’t have to be loaded or computed more than once in a user’s session.

User session persistence

You’ll need a key to store such sessions in Redis,^[2] and for this example :consumer-id will serve you well. You’ll add a level of indirection so the code will read better as well as let you change this decision later if you desire:

²
See http://redis.io. In this chapter we follow the redis-clojure library’s API by Ragnar Dahlén (https://github.com/ragnard/redis-clojure), but you should use the newer and better Carmine library for your own projects (https://github.com/ptaoussanis/carmine). The code package with this book contains a redis namespace that mocks enough of the redis-clojure library for you to run the code in this chapter without running a Redis server or using a Redis library.

(def redis-key-for :consumer-id)

First, you’ll define a way to save sessions into Redis and also to load them back out. Here’s a pair of functions that do that:

(defn save-session [session]
  (redis/set (redis-key-for session) (pr-str session)))
(defn find-session [consumer-id]
  (read-string (redis/get consumer-id)))

Now that you have the essential capability of storing and loading sessions, you have a design decision to make. If you consider the user session to be the central concept in your behavioral targeting domain, then you can write it such that the DSL always executes in the context of a session. You could define a var called *session* that you’ll then bind to the specific one during a computation:

(def ^:dynamic *session*)

And you could define a convenience macro that sets up the binding:

(defmacro in-session [consumer-id & body]
  `(binding [*session* (find-session ~consumer-id)]
     (do ~@body)))
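If you’d like to try the binding mechanism without Redis, the following sketch swaps in an in-memory atom. The atom named store is an assumption made purely for illustration and stands in for the redis/set and redis/get calls; everything else mirrors the code above.

```clojure
(def ^:dynamic *session*)   ; unbound until in-session binds it

;; store is a hypothetical in-memory stand-in for Redis.
(def store (atom {}))

(defn save-session [session]
  (swap! store assoc (:consumer-id session) (pr-str session)))

(defn find-session [consumer-id]
  (read-string (get @store consumer-id)))

(defmacro in-session [consumer-id & body]
  `(binding [*session* (find-session ~consumer-id)]
     ~@body))

(save-session {:consumer-id "abc" :user-agent :safari})

;; Inside in-session, *session* holds the deserialized map.
(in-session "abc"
  (:user-agent *session*))
;=> :safari

;; Outside in-session, the var has no value.
(bound? #'*session*)
;=> false
```

Note that clojure.core/read-string can evaluate code embedded in the string, so clojure.edn/read-string is the safer choice for data you don’t control.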

The following listing shows the complete session namespace defined so far.

Listing 11.2. Basic functions to handle session persistence in Redis

(ns clj-in-act.ch11.session
  (:require redis))
(def redis-key-for :consumer-id)
(def ^:dynamic *session*)
(defn save-session [session]
  (redis/set (redis-key-for session) (pr-str session)))
(defn find-session [consumer-id]
  (read-string (redis/get consumer-id)))
(defmacro in-session [consumer-id & body]
  `(binding [*session* (find-session ~consumer-id)]
     (do ~@body)))

Now that you’ve dealt with persisting user sessions, next you’ll focus on the segmentation itself.

Segmenting users

In your application, you’d like to satisfy two qualitative requirements of this segmentation process. First, these rules shouldn’t be hardcoded into your application; it should be possible to dynamically update the rules. Second, these rules should be expressed in a format that’s somewhat analyst friendly. That is, the rules should be in a DSL that’s somewhat simpler for nonprogrammers to express ideas in. Here’s an example of something you might allow:

(defsegment googling-clojurians
     (and
       (> (count $search-terms) 0)
       (matches? $url-referrer "google")))

Here’s another example of the desired language:

(defsegment loyal-safari
     (and
       (empty? $url-referrer)
       (= :safari $user-agent)))

Notice the symbols prefixed with $. These are meant to have special significance in your DSL, because they’re the elements that will be looked up and substituted from the user’s session. Your job now is to implement defsegment so that the previous definition is compiled into something meaningful.

Syntax of Clojure DSLs

In many programming languages, especially dynamic ones such as Ruby and Python, domain-specific languages have become all the rage. There are two kinds of DSLs: internal and external. Internal DSLs are hosted on top of a language such as Ruby and use the underlying language to execute the DSL code. External DSLs are limited forms of regular programming languages in the sense that they have a lexer and parser that convert DSL code that conforms to a grammar into executable code. Internal DSLs are often simpler and serve most requirements that a DSL might need to satisfy.

Such DSLs are often focused on providing English-like readability, and a lot of text-parsing code is dedicated to converting the easy-to-read text into constructs of the underlying language. Clojure, on the other hand, has its magical reader. It can read an entire character stream and convert it into a form that can be executed. The programmer doesn’t have to do anything to support the lexical analysis, tokenizing, and parsing. Clojure even provides a macro system to further enhance the capabilities of textual expression.

This is the reason why many Clojure DSLs look much like Clojure. Clojure DSLs are often based on s-expressions because using the reader to do the heavy lifting of creating a little language is the most straightforward thing to do. The book DSLs in Action by Debasish Ghosh (Manning Publications, 2010) is a great resource if you’re interested in DSLs in a variety of languages.

You can start with a macro skeleton that looks like this:

(defmacro defsegment [segment-name & body])

You’ll begin by handling the $ prefixes. You’ll transform the body expressions so that every symbol prefixed with $ becomes a session lookup of the attribute with the same name. Something like $user-agent will become (*session* :user-agent). To perform this transformation, you’ll need to recursively walk the body expression to find all the symbols that need this substitution and then rebuild a new expression with the substitutions made. Luckily, you don’t have to write this code because it exists in the clojure.walk namespace. This namespace contains several functions that are useful when walking through Clojure code data structures. The postwalk function fits the bill:

(doc postwalk)
-------------------------
clojure.walk/postwalk
([f form])
  Performs a depth-first, post-order traversal of form.  Calls f on
  each sub-form, uses f's return value in place of the original.
  Recognizes all Clojure data structures except sorted-map-by.
  Consumes seqs as with doall.
;=> nil
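To see what “depth-first, post-order” means in practice, here’s a small sketch that records every form postwalk hands to its function. The atom visited is a hypothetical helper used only for this demonstration.

```clojure
(require '[clojure.walk :refer [postwalk]])

;; visited is a hypothetical atom that records each form as postwalk reaches it.
(def visited (atom []))

(postwalk (fn [form]
            (swap! visited conj form)
            form)                       ; return the form unchanged
          '(> (count $search-terms) 0))

;; Leaves are visited before the forms that contain them.
@visited
;=> [> count $search-terms (count $search-terms) 0 (> (count $search-terms) 0)]
```

Each symbol is seen on its own before its enclosing list, which is what makes symbol-by-symbol replacement straightforward.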

This is what you need, so you can transform your DSL code using the following function:

(defn transform-lookups [dollar-attribute]
  (let [prefixed-string (str dollar-attribute)]
    (if-not (.startsWith prefixed-string "$")
      dollar-attribute
      (session-lookup prefixed-string))))

You’ll need a couple of support functions, namely, session-lookup and drop-first-char, which can be implemented as follows:

(defn drop-first-char [name]
  (apply str (rest name)))
(defn session-lookup [dollar-name]
  (->> (drop-first-char dollar-name)
       (keyword)
       (list '*session*)))

Now test that the code you wrote does what’s expected:

(transform-lookups '$user-agent)
;=> (*session* :user-agent)

This is a simple test, but note that the resulting form can be used to look up attributes of a user session if the *session* dynamic var is bound appropriately.

Now, use postwalk to test your replacement logic on a slightly more complex form:

(postwalk transform-lookups '(> (count $search-terms) 0))
;=> (> (count (*session* :search-terms)) 0)

That works as expected. You now have a tool to transform the DSL body expressed using the $-prefixed symbols into usable Clojure code. As an aside, you also have a place where you can make more complex replacements if you need to.
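One replacement refinement worth considering: because transform-lookups calls str on every form, a string literal that happens to start with $ would also be rewritten. The sketch below is a possible stricter variant that rewrites symbols only. The name transform-symbol-lookups is hypothetical and isn’t used in the rest of the chapter.

```clojure
(require '[clojure.walk :refer [postwalk]])

;; transform-symbol-lookups is a hypothetical, stricter variant of
;; transform-lookups: only symbols starting with $ are rewritten.
(defn transform-symbol-lookups [form]
  (if (and (symbol? form)
           (.startsWith (name form) "$"))
    (list '*session* (keyword (subs (name form) 1)))
    form))

;; The $-prefixed symbol is replaced; the string literal is left alone.
(postwalk transform-symbol-lookups
          '(matches? $url-referrer "$5 off"))
;=> (matches? (*session* :url-referrer) "$5 off")
```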

You can now use this in the definition of defsegment as follows:

(defmacro defsegment [segment-name & body]
  (let [transformed (postwalk transform-lookups body)]))

You’ve now transformed the body as specified by the user of your DSL, and you now need to convert it into something you can execute later. Let’s look at what you’re working with:

(postwalk transform-lookups '(and
                               (> (count $search-terms) 0)
                               (= :safari $user-agent)))
;=> (and
      (> (count (*session* :search-terms)) 0)
      (= :safari (*session* :user-agent)))

The simplest way to execute this later is to convert it into a function. You can then call the function whenever you need to run this rule. You used a similar approach when you defined the remote worker framework, where you stored computations as anonymous functions that were executed on remote servers. If you’re going to do this, you’ll need a place to put the functions. You’ll create a new namespace to keep all code related to this storing of functions for later use. It’s shown in the following listing.

Listing 11.3. `dsl-store` namespace for storing the rules as anonymous functions

(ns clj-in-act.ch11.dsl-store)
(def RULES (ref {}))
(defn register-segment [segment-name segment-fn]
  (dosync
   (alter RULES assoc-in [:segments segment-name] segment-fn)))
(defn segment-named [segment-name]
  (get-in @RULES [:segments segment-name]))
(defn all-segments []
  (:segments @RULES))
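As a quick check of how the store behaves, here’s a sketch that registers a trivial rule and reads it back. The definitions are repeated so the snippet runs on its own, and the segment name :always-in is made up for the example.

```clojure
(def RULES (ref {}))

(defn register-segment [segment-name segment-fn]
  (dosync
   (alter RULES assoc-in [:segments segment-name] segment-fn)))

(defn segment-named [segment-name]
  (get-in @RULES [:segments segment-name]))

;; :always-in is a hypothetical segment that matches every user.
(register-segment :always-in (fn [] true))

;; Look the function up by name and call it.
((segment-named :always-in))
;=> true

;; Unknown segments simply yield nil.
(segment-named :no-such-segment)
;=> nil

;; Registering the same name again replaces the earlier function.
(register-segment :always-in (fn [] false))
((segment-named :always-in))
;=> false
```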

Now that you know you can put functions where you can find them again later, you’re ready to improve the definition of defsegment:

(defmacro defsegment [segment-name & body]
  (let [transformed (postwalk transform-lookups body)]
    `(let [segment-fn# (fn [] ~@transformed)]
       (register-segment ~(keyword segment-name) segment-fn#))))

You now have all the pieces together for your DSL to compile. The following listing shows the complete segment namespace.

Listing 11.4. Segmentation DSL defined using a simple macro

(ns clj-in-act.ch11.segment
  (:use clj-in-act.ch11.dsl-store
        clojure.walk))
(defn drop-first-char [name]
  (apply str (rest name)))
(defn session-lookup [dollar-name]
  (->> (drop-first-char dollar-name)
       (keyword)
       (list '*session*)))
(defn transform-lookups [dollar-attribute]
  (let [prefixed-string (str dollar-attribute)]
    (if-not (.startsWith prefixed-string "$")
      dollar-attribute
      (session-lookup prefixed-string))))
(defmacro defsegment [segment-name & body]
  (let [transformed (postwalk transform-lookups body)]
    `(let [segment-fn# (fn [] ~@transformed)]
       (register-segment ~(keyword segment-name) segment-fn#))))

Here it is in action, at the REPL:

(defsegment loyal-safari
  (and
    (empty? $url-referrer)
    (= :safari $user-agent)))
;=> {:segments
      {:loyal-safari
        #<user$eval3457$segment_fn__3232__auto____3458 user$eval3457$segment_fn__3232__auto____3458@5054c2b8>}}

The definition of googling-clojurians still won’t work, because it will complain about an unknown matches? function. You’re going to solve this and add more functionality in the next couple of sections.

Fine-tuning: The power of the DSL

So far, you’ve put together the plumbing of the DSL. You can define some DSL code and expect it to compile and some functions to be created and stored as a result. At least three things influence how powerful your DSL can be.

The first is the data inside a user’s session. Entities such as $url-referrer and $search-terms are examples of this. These data elements are obtained either directly from the web session of the user, from historical data about the user, or from any other source that has been used to load information into the user’s session.

The second factor is the number of primitives that can be used to manipulate the data elements. Examples of such primitives are empty? and count. You’ve leveraged Clojure’s own functions here, but there’s nothing to stop you from adding more. You’ll actually be adding a function called matches? to the example shortly.

The final factor is combinability, or how the data elements and the language primitives can be combined to create more complex forms. Here again you can use all of Clojure’s built-in facilities. For example, in the previous examples, you used and and >.

In the next section, you’ll focus on creating new primitives, and then you’ll write code to execute the DSL. Such new primitives will make the DSL more powerful and expressive.

Adding primitives to the execution engine

As you can imagine, matches? is a function. For the purposes of this example here, it can be as simple as this:

(defn matches? [^String superset ^String subset]
  (and
   (not (empty? superset))
   ;; indexOf returns 0 for a match at the very start, so compare with >=
   (>= (.indexOf superset subset) 0)))

You can add more functions such as this one, and they can be as complex as needed. The user of the DSL doesn’t need to know how they’re implemented, because they’ll be described as the primitives of the DSL.
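For instance, a primitive that checks whether a particular term was searched for might look like the following sketch. The name searched-for? is hypothetical and isn’t part of the chapter’s code.

```clojure
;; searched-for? is a hypothetical primitive, shown only as an illustration.
;; It returns true when term appears in the collection of search terms.
(defn searched-for? [search-terms term]
  (boolean (some #{term} search-terms)))

(searched-for? ["clojure" "programmers"] "clojure")
;=> true

(searched-for? ["clojure" "programmers"] "ruby")
;=> false

;; A missing attribute (nil) is treated as no match.
(searched-for? nil "clojure")
;=> false
```

In a rule, it could be used as (searched-for? $search-terms "clojure"), reading the same way as the built-in primitives.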

Now, let’s go ahead and define the remainder of the execution engine. The first piece is a function to load the DSL program. Typically, this will be some text either written by a user or generated by another program such as a graphical rules editor. Given that ultimately the DSL is Clojure code, you can use load-string to load it. Consider the following code:

(ns clj-in-act.ch11.engine
  (:use clj-in-act.ch11.segment
        clj-in-act.ch11.session
        clj-in-act.ch11.dsl-store))
(defn load-code [code-string]
  (binding [*ns* (:ns (meta #'load-code))]
    (load-string code-string)))

Note that the load-code function first switches the namespace to its own (using metadata available on the load-code var) because all supporting functions are available in it. This way, load-code can be called from anywhere, and all supporting functions can be found. It then calls load-string.

The next step is to execute a segment function and to see if it returns true or false. A true value means that the user belongs to that segment. The following function checks this:

(defn segment-satisfied? [[segment-name segment-fn]]
  (if (segment-fn)
    segment-name))

You now have all the pieces to take a bunch of segment definitions and classify a user into one or more of them (or none of them). Consider the classify function:

(defn classify []
  (->> (all-segments)
       (map segment-satisfied?)
       (remove nil?)))

The complete source of the engine namespace is shown in the following listing.

Listing 11.5. Simple DSL execution engine to classify users into segments

(ns clj-in-act.ch11.engine
  (:use clj-in-act.ch11.segment
        clj-in-act.ch11.session
        clj-in-act.ch11.dsl-store))
(defn load-code [code-string]
  (binding [*ns* (:ns (meta #'load-code))]
    (load-string code-string)))
(defn matches? [^String superset ^String subset]
  (and
   (not (empty? superset))
   ;; indexOf returns 0 for a match at the very start, so compare with >=
   (>= (.indexOf superset subset) 0)))
(defn segment-satisfied? [[segment-name segment-fn]]
  (if (segment-fn)
    segment-name))
(defn classify []
  (->> (all-segments)
       (map segment-satisfied?)
       (remove nil?)))

Now you’ll test it at the REPL. You’ll begin by creating a string that contains the definitions of the two segments in your new DSL:

(def dsl-code (str
  '(defsegment googling-clojurians
     (and
      (> (count $search-terms) 0)
      (matches? $url-referrer "google")))
  '(defsegment loyal-safari
     (and
      (empty? $url-referrer)
      (= :safari $user-agent)))))
;=> #'user/dsl-code
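If it isn’t obvious why calling str on quoted forms produces loadable code, the following sketch shows the two steps in isolation. The name two-forms is made up for the example.

```clojure
;; str on a quoted form yields its printed, readable representation,
;; with string literals still quoted.
(str '(matches? $url-referrer "google"))
;=> "(matches? $url-referrer \"google\")"

;; Several forms concatenate into one string of source text.
;; two-forms is a hypothetical name.
(def two-forms (str '(+ 1 2) '(* 3 4)))

two-forms
;=> "(+ 1 2)(* 3 4)"

;; load-string evaluates each form in turn and returns the last value.
(load-string two-forms)
;=> 12
```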

Next, you’ll bring in your little DSL engine:

(use 'clj-in-act.ch11.engine)
;=> nil

It’s now easy to load up the segment definitions:

(load-code dsl-code)
;=> {:segments {:loyal-safari #<engine$eval3399$segment_fn__2833_
TRUNCATED OUTPUT

To test classification, you’re going to need a user session and Redis running. You can set up a session for testing purposes by defining one at the REPL as follows:

(def abc-session {
    :consumer-id "abc"
    :url-referrer "http://www.google.com/search?q=clojure+programmers"
    :search-terms ["clojure" "programmers"]
    :ip-address "192.168.0.10"
    :tz-offset 480
    :user-agent :safari})
;=> #'user/abc-session

Now put it into Redis:

(require 'redis) (use 'clj-in-act.ch11.session)
;=> nil
(redis/with-server {:host "localhost"}
  (save-session abc-session))
;=> "OK"

Everything is set up now, and you can test segmentation:

(redis/with-server {:host "localhost"}
  (in-session "abc"
    (println "The current user is in:" (classify))))
The current user is in: (:googling-clojurians)
;=> nil

It works as expected. Note that the classify function returns a lazy sequence that’s realized by the call to println. If you were to omit that, you’d need a doall to see it at the REPL; otherwise, it will complain about the *session* var not being bound.
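The interaction between laziness and dynamic binding can be reproduced without Redis or the DSL. In this sketch, attrs is a hypothetical helper that, like the generated segment functions, calls *session* as a function inside a lazy map.

```clojure
(def ^:dynamic *session*)

;; attrs is a hypothetical helper; it looks up each key lazily.
(defn attrs [ks]
  (map (fn [k] (*session* k)) ks))

;; Realized inside the binding: the lookups see the bound session.
(binding [*session* {:user-agent :safari}]
  (doall (attrs [:user-agent])))
;=> (:safari)

;; Returned unrealized: the lookups run later, after the binding is gone.
(def escaped
  (binding [*session* {:user-agent :safari}]
    (attrs [:user-agent])))

(try
  (doall escaped)
  (catch Exception e
    :session-unbound))
;=> :session-unbound
```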

With this, you have the basics working end to end. Expanding the DSL is as easy as adding new data elements and new primitives such as the matches? function. You can also expand the $attribute syntax by doing more in the postwalk transformation. Before addressing updating rules, you’ll add a way to name the abstractions you’re defining and allow for segments to be reused.

Increasing combinability

Imagine that you’d like to narrow the scope of the googling-clojurians crowd. You’d like to know which of these folks are also using the Chrome browser. You could create a segment as follows:

(defsegment googling-clojurians-chrome
     (and
      (> (count $search-terms) 0)
      (matches? $url-referrer "google")
      (= :chrome $user-agent)))

This will work fine, but it has the obvious problem that two of the three conditions are duplicated in the googling-clojurians segment. In a normal programming language, creating a named entity and replacing the duplicate code in both places with that entity can remove such duplication. For example, you could create a Clojure function and call it from both places.

If you do that, you’ll expose the lower-level details of the implementation of your DSL to the eventual users of the DSL. It would be ideal if you could hide that detail while letting them use named entities. Consider this revised implementation of defsegment:

(defmacro defsegment [segment-name & body]
  (let [transformed (postwalk transform-lookups body)]
    `(let [segment-fn#  (fn [] ~@transformed)]
       (register-segment ~(keyword segment-name) segment-fn#)
       (def ~segment-name segment-fn#))))

The change made here does what you talked about doing by hand. The definition of a segment now also creates a var by the same name. It can be used as follows:

(defsegment googling-clojurians-chrome
     (and
      (googling-clojurians)
      (= :chrome $user-agent)))

This is equivalent in functionality to the previous definition of this segment, with the duplication removed. This is an example of increasing the combinability of domain-specific entities, where segment definitions are built on top of the lower-level session-lookup primitives, combined with built-in logical operators. Note that because all your DSL code is evaluated within a single namespace, every segment name becomes a var in that one namespace. This could cause name conflicts, for instance between two segments or between a segment and a function of the engine itself, and it may need to be addressed, depending on the requirements.
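To see the composition run end to end, here is a minimal, self-contained sketch. It is not the chapter’s listing: the `segment-sketch` namespace, the `segments` atom, the map-valued `*session*`, and the bodies of `register-segment`, `matches?`, and `transform-lookups` are all simplified stand-ins assumed for illustration (the real versions read the session from Redis). Only the `defsegment` macro has the same shape as the revised version above.

```clojure
(ns segment-sketch
  (:require [clojure.string :as str]
            [clojure.walk :refer [postwalk]]))

;; Assumption: the current session is a plain map bound to a dynamic var.
(def ^:dynamic *session* {})

;; Assumption: segments are stored in an atom, keyed by keyword.
(def segments (atom {}))

(defn register-segment [segment-name segment-fn]
  (swap! segments assoc segment-name segment-fn))

;; Hypothetical stand-in for the chapter's primitive: a substring test.
(defn matches? [s substring]
  (boolean (and s (str/includes? s substring))))

;; Rewrites a symbol such as $user-agent into (get *session* :user-agent);
;; every other form is returned unchanged.
(defn transform-lookups [form]
  (if (and (symbol? form) (str/starts-with? (name form) "$"))
    `(get *session* ~(keyword (subs (name form) 1)))
    form))

;; Same shape as the revised macro: register the function and def a var.
(defmacro defsegment [segment-name & body]
  (let [transformed (postwalk transform-lookups body)]
    `(let [segment-fn# (fn [] ~@transformed)]
       (register-segment ~(keyword segment-name) segment-fn#)
       (def ~segment-name segment-fn#))))

(defsegment googling-clojurians
  (and (> (count $search-terms) 0)
       (matches? $url-referrer "google")))

;; The narrower segment reuses the broader one by calling its var.
(defsegment googling-clojurians-chrome
  (and (googling-clojurians)
       (= :chrome $user-agent)))

(binding [*session* {:search-terms ["clojure"]
                     :url-referrer "http://www.google.com/search?q=clojure"
                     :user-agent   :chrome}]
  (googling-clojurians-chrome))
;; => true
```

Because each segment is a zero-argument function that reads `*session*` when called, the inner call to `(googling-clojurians)` sees the same session binding as the outer segment, so no arguments need to be threaded through.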

Another example of a language-level construct is in-session, which given a customer ID sets up the execution context for classification. It abstracts away the details of where the session is stored and how to access and load it.

Although this is a small example, we’ve explored several of the concepts we talked about in the opening discussion. The last step will be to look at how the DSL can be updated dynamically.

Dynamic updates

With the DSL, you’ve put a linguistic layer on top of the underlying code. You’d also like to be able to update the rules dynamically. You’ve already seen this happen, but we didn’t focus on it. Consider again a definition such as this:

(defsegment googling-clojurians
    (and
     (> (count $search-terms) 0)
     (matches? $url-referrer "yahoo")))

You know that evaluating this code will change the definition of the segment known as googling-clojurians (not to mention that it’s named incorrectly, because Yahoo! search is being used). But the following code has the same effect:

(load-code (str '(defsegment googling-clojurians
     (and
      (> (count $search-terms) 0)
      (matches? $url-referrer "yahoo")))))

Notice that load-code accepts a string. This DSL code snippet can be created anywhere, even from outside your execution engine. It could be created, say, from a text editor and loaded in via a web service.
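The mechanism can be sketched in a few lines. This is an illustration under stated assumptions, not the chapter’s engine: the `load-code-sketch` namespace, the `rules` atom, the `dsl-ns` var, and the `defrule` macro are hypothetical stand-ins for the segment store and defsegment, and this `load-code` is one plausible way to write such a function using `load-string`.

```clojure
(ns load-code-sketch)

;; Assumption: a registry of rule functions, standing in for the segment store.
(def rules (atom {}))

;; Hypothetical, simplified stand-in for defsegment (no $-lookups).
(defmacro defrule [rule-name & body]
  `(swap! rules assoc ~(keyword rule-name) (fn [] ~@body)))

;; Remember the namespace in which the DSL macros are defined.
(def dsl-ns *ns*)

(defn load-code
  "Reads and evaluates every form in code-string inside the DSL namespace,
  so that unqualified names such as defrule resolve correctly."
  [code-string]
  (binding [*ns* dsl-ns]
    (load-string code-string)))

;; A quoted form turned into a string, as in the text...
(load-code (str '(defrule greeting "hello")))
((:greeting @rules))
;; => "hello"

;; ...or a string that arrived from anywhere else, replacing the rule.
(load-code "(defrule greeting \"goodbye\")")
((:greeting @rules))
;; => "goodbye"
```

Binding `*ns*` matters because the code calling `load-code` may be running in a different namespace than the one holding the DSL. Note also that `load-string` evaluates whatever it is given, so strings should come only from trusted sources.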

Let’s take another example by imagining you had a set of remote worker processes that implemented your rule engine to classify users into segments. You can imagine classify being implemented using def-worker. When sent a request, it will access a commonly available Redis server, find the specified user session, and classify the user into segments. This is no different from what you’ve seen earlier, except for the fact that this code would run on multiple remote servers.

Now, imagine load-code also being implemented as a def-worker. In this scenario, not only could you remotely load DSL code, but you could also use run-worker-everywhere to broadcast DSL updates across all remote workers. You’d get the ability to update the segmentation cluster in real time, with no code to deploy.

We’ll end this section with one last point. We haven’t addressed error checking the DSL code so far, and in a production system you’d definitely need to do that. You’ve also built quite a minimal DSL, and you could certainly make it arbitrarily powerful. Being able to use the full Clojure language inside it is a powerful feature that can be used by power users if so desired. As the capability of the DSL itself is expanded to do more than segmentation, the ability to update running code in such a simple way as described previously could prove to be useful.

11.6. Summary

When most people start out with the Lisp family of programming languages, they first ask about the odd syntax. The answer to that question is the macro system. In that sense, we’ve come full circle. Macros are special because they make Clojure a programmable programming language. They allow the programmer to mold the core language into one that suits the problem at hand. In this way, Clojure blurs the line between the designers of the language itself and the programmer.

This chapter started with a few advanced uses of Clojure macros. Anaphoric macros aren’t used a lot, and they certainly come with their gotchas, but when applied carefully, they can result in truly elegant solutions. Similarly, moving computation into the compile phase of your program seems like something that isn’t done often. Certainly, the example we looked at gives only a glimpse into what’s possible. It’s an important technique, though, that can be effective when needed. Finally, macros that define other macros threaten to send you down the rabbit hole. Understanding such use of the macro system is the only way to true Lisp mastery.

Lisp encourages a certain style of programming. Everyone seems to be talking about domain-specific languages these days, but in Clojure, it’s the normal way to build programs. You’ve written code similar to the behavioral targeting DSL example throughout the book, be it the mocking framework to help you write tests in chapter 10, the object system utility in chapter 8, or the faux protocols library called modus operandi in chapter 9. Some people express concern about the misuse of macros, but the real concern should be an incomplete understanding of the Lisp way.
