Chapter 3. Building blocks of Clojure

This chapter covers

  • Clojure metadata
  • Java exceptions
  • Higher-order functions
  • Scoping rules
  • Clojure namespaces
  • Clojure’s destructuring feature
  • Clojure’s reader literals

When people are good at something already (such as a programming language), and they try to learn something new (such as another programming language), they often fall into what Martin Fowler (martinfowler.com/bliki/Improvement-Ravine.html) calls an “improvement ravine.” For programming, the ravine refers to the drop in productivity experienced when one has to relearn how to do things in the new language. We’ve all been guilty of switching back to a language we’re already good at to get the job done. It sometimes takes several attempts to get over enough of the ravine to accomplish simple things. The next few chapters aim to do that—we’ll review the basics of Clojure in more detail. After reading them, you’ll be comfortable enough to solve problems of reasonable complexity. We’ll also cover most of the remaining constructs of the language, many of which will be familiar to you if you use other common languages.

First, we’ll examine metadata, which is a unique way to associate additional data with an ordinary Clojure value without changing the value. Next, we’ll show you another piece of Java interop: exception handling and throwing.

Then in the meat of the chapter we’ll examine functions in some detail. Lisp was born in the context of mathematics, and functions are fundamental to it. Clojure uses functions as building blocks, and thus mastering functions forms the basis of learning Clojure. We’ll then look at how namespaces help organize large programs. These are similar to Java packages; they’re a simple way to keep code organized by dividing the program into logical modules.

The next section will examine vars (those things created by def) in detail and how to use them effectively. The following section is about destructuring, something that’s rather uncommon in most languages. Destructuring is a neat way of accessing interesting data elements from inside larger data structures.

Finally, we’ll conclude this chapter by taking a look at reader literals, which will allow you to add your own convenience syntax for data literals. Without any further ado, let’s begin with metadata.

3.1. Metadata

Metadata means data about data. Clojure supports tagging data (for example, maps, lists, and vectors) with other data without changing the value of the tagged data. What this means specifically is that the same values with different metadata will still compare equal.

The point of using immutable values instead of mutable objects is that you can easily compare values by their content instead of their identity. The two vectors [1 2 3] and [1 2 3] are the same even if they have different addresses in computer memory, so it doesn’t matter which one your program uses. But in the real world you often need to distinguish between otherwise identical things in meaningful ways. For example, one value may compare equal to another, but it makes a difference if one value came from an untrusted network source or a file with a specific name. Metadata provides a way to add identity to values when it matters.

For example, you’ll use the tags :safe and :io to determine if something is considered a security threat and if it came from an external I/O source. Here’s how you might use metadata to represent such information:

(def untrusted (with-meta {:command "delete-table" :subject "users"}
                          {:safe false :io true}))

Now the map with keys :command and :subject has a metadata map attached to it with keys :safe and :io. Metadata is always a map. Note that the metadata map is attached on the “outside” of the object with metadata: :safe and :io aren’t ever added as keys to the original map.

You can also define metadata with a shorthand syntax using the reader macro ^{}. This example is exactly the same as the previous one except the metadata is added at read time instead of eval time:

(def untrusted ^{:safe false :io true} {:command "delete-table"
                                        :subject "users"})

The read-time versus eval-time distinction is important: the following example is not the same as using with-meta:

(def untrusted ^{:safe false :io true} (hash-map :command "delete-table"
                                                 :subject "users"))

This associates metadata with the list starting with hash-map, not the hash map that a function call produces, so this metadata becomes invisible at runtime.
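The difference can be checked at the REPL. The following is a minimal sketch; the var names literal-tagged and call-tagged are hypothetical and chosen only for illustration:

```clojure
;; Metadata written on a map literal survives evaluation.
(def literal-tagged ^{:safe false} {:command "delete-table"})

;; Metadata written on a function-call form is attached to the unevaluated
;; list, not to the map that hash-map returns at runtime.
(def call-tagged ^{:safe false} (hash-map :command "delete-table"))

(meta literal-tagged)
;=> {:safe false}
(meta call-tagged)
;=> nil
```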

Objects with metadata can be used like any other objects. The additional metadata doesn’t affect their values. In fact, if you check untrusted at the read-evaluate-print loop (REPL), the metadata won’t even appear:

untrusted
;=> {:command "delete-table", :subject "users"}

As mentioned earlier, metadata doesn’t affect value equality; therefore, untrusted can be equal to another map that doesn’t have any metadata on it at all:
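A sketch of that comparison, assuming a plain map named trusted (the name used by the meta call that follows) with the same contents and no metadata:

```clojure
;; Repeated from earlier so the snippet runs on its own.
(def untrusted (with-meta {:command "delete-table" :subject "users"}
                          {:safe false :io true}))

;; Same contents as untrusted, but with no metadata attached.
(def trusted {:command "delete-table" :subject "users"})

(= trusted untrusted)
;=> true
```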

If you want to examine the metadata associated with the value, you can use the meta function:

(meta untrusted)
;=> {:safe false, :io true}
(meta trusted)
;=> nil

When new values are created from those that have metadata, the metadata is copied over to the new data. This preserves the identity semantics of metadata. For example:

(def still-untrusted (assoc untrusted :complete? false))
;=> #'user/still-untrusted
still-untrusted
;=> {:complete? false, :command "delete-table", :subject "users"}
(meta still-untrusted)
;=> {:safe false, :io true}

Functions and macros can also be defined with metadata. Here’s an example:

(defn ^{:safe true :console true
        :doc "testing metadata for functions"}
  testing-meta
  []
  (println "Hello from meta!"))

Now try using the meta function to check that the metadata was set correctly:

(meta testing-meta)
;=> nil

This returns nil because the metadata is associated with the var testing-meta and not the function itself. To access the metadata, you’d have to pass the testing-meta var to the meta function. You can do this as follows:

(meta (var testing-meta))
;=> {:ns #<Namespace user>,
     :name testing-meta,
     :file "NO_SOURCE_FILE",
     :line 1, :arglists ([]),
     :console true,
     :safe true,
     :doc "testing metadata for functions"}

You’ll learn more about vars and functions later in this chapter.

Metadata is useful in many situations where you want to tag things for purposes orthogonal to the data they represent. Annotations are one example: you might perform certain tasks only if objects are annotated a certain way, such as if their metadata contains a certain key and value. By the way, this may seem similar to Java’s annotations, but it’s more flexible. In Clojure, nearly any value can carry metadata, whereas Java annotations attach to declarations in source code (such as classes, methods, and fields) rather than to arbitrary runtime values. Sadly, you can’t add Clojure metadata to native Java types such as strings.

Clojure internally uses metadata quite a lot; for example, the :doc key is used to hold the documentation string for functions and macros, the :macro key is set to true for functions that are macros, and the :file key is used to keep track of what source file something was defined in.

3.1.1. Java type hints

One of the pieces of metadata you may encounter often when making Java method calls from Clojure is a Java type hint, which is stored in the meta key :tag. It’s used often enough that it has its own reader macro syntax: ^symbol. Why do you need this?

When you make a Java method call using interop, the Java virtual machine (JVM) needs to know what class defines a method name so it can find the implementation of a method in the class. In Java this is normally not a problem because most types are annotated in the Java code and verified at compile time. Clojure is dynamically typed, however, so often the type of a variable isn’t known until runtime. In these cases the JVM needs to use reflection to determine the class of an object at runtime and find the correct method to call. This works fine but can be slow. Here’s an example of the problem:
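A minimal sketch of the situation, assuming two small functions named string-length and fast-string-length (the second name is the one the following paragraph refers to). Timings are omitted because they vary by machine:

```clojure
;; Ask the compiler to report interop calls it cannot resolve statically.
(set! *warn-on-reflection* true)

;; No hint: the compiler cannot tell which class's .length to call, so it
;; emits a reflection warning and looks the method up at runtime.
(defn string-length [x] (.length x))

;; With ^String the call compiles to a direct String.length() invocation.
(defn fast-string-length [^String x] (.length x))

(string-length "12345")
;=> 5
(fast-string-length "12345")
;=> 5

;; The hint is stored as :tag metadata on the argument symbol.
(meta (first (first (:arglists (meta #'fast-string-length)))))
;=> {:tag String}
```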

The last line demonstrates how Clojure stores type hints on function arguments: here you inspect the metadata on fast-string-length, get its :arglists (list of the signatures of all a function’s arities), and get metadata on the x symbol in the argument list.

Clojure’s compiler is pretty smart about inferring types, and all core Clojure functions are already type-hinted where necessary, so it’s not that often that you’ll need to resort to type hints. The idiomatic approach is to write all your code without type hints, then evaluate (set! *warn-on-reflection* true), and keep reevaluating your namespace and adding hints one at a time until the reflection warnings go away. If you concentrate type hinting on function arguments and return values, Clojure will often figure out all the types in the function body for you. You can read all the details of type hinting (including how to hint function return values) in Clojure’s documentation at http://clojure.org/java_interop#Java%20Interop-Type%20Hints.

3.1.2. Java primitive and array types

Java has some special types called primitives that aren’t full-fledged objects and that get special treatment by the JVM to increase speed and save memory (http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html). They’re recognized by their lowercase type names in Java documentation: byte, short, int, long, float, double, boolean, and char. These are sometimes called unboxed types because they don’t have an object “box” around them or any object methods.[1] Java arrays are fixed-length homogeneous containers for other types; they also get special treatment from the JVM, and there’s even a different array type for each possible thing the array can hold!

[1] Java will automatically “box” a primitive in its corresponding object wrapper (for example, a long in a java.lang.Long) when the primitive is used where an object is required. See http://docs.oracle.com/javase/tutorial/java/data/autoboxing.html for details.

Primitives don’t have a pronounceable class name to refer to them, so it’s not obvious how to type hint them. Fortunately, Clojure defines aliases for all the primitive types and for all arrays of primitive types: just use a type hint like ^byte for the primitive and the plural form ^bytes for the array-of-primitive.
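As a small illustration of these aliases, here is a sketch with a hypothetical function total-length (the function and parameter names are invented for this example):

```clojure
;; ^bytes hints a byte[] array; ^long hints a primitive long argument.
;; The ^long before the parameter vector hints the return value.
(defn total-length ^long [^bytes payload ^long header-size]
  (+ header-size (alength payload)))

(total-length (byte-array 3) 4)
;=> 7
```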

But you may occasionally need to type hint an array of Java objects. In this case you need to do some magic to find the strange class name:
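One way to do this, sketched with a hypothetical helper array-type and a hypothetical function first-amount, is to build an empty array of the element class and ask the JVM for its class name:

```clojure
;; Build an empty array of the given class and return its JVM class name.
(defn array-type [klass]
  (.getName (class (make-array klass 0))))

(array-type BigDecimal)
;=> "[Ljava.math.BigDecimal;"

;; The class name can then be used as a string type hint.
(defn first-amount [^"[Ljava.math.BigDecimal;" amounts]
  (aget amounts 0))

(first-amount (into-array BigDecimal [1.0M 2.5M]))
;=> 1.0M
```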

Pretty much the only time you’ll need to know the class name of an object array is when you’re writing classes and interfaces in Clojure that are meant to be used by Java code and you need to accept or return an array of objects. We’ll cover this topic in chapter 6.

3.2. Java exceptions: try and throw

Java has exceptions, as you’ve seen, but until now we haven’t mentioned how to manipulate them in Clojure. If an expression has the potential to throw an exception, a try/catch/finally block can be used to catch it and decide what to do with it.[2] Suppose you have a function that calculates the average of a collection of numbers:

[2] If you’re familiar with Java, note that Clojure doesn’t have checked exceptions. Catching and handling exceptions are always optional in Clojure.

(defn average [numbers]
  (let [total (apply + numbers)]
    (/ total (count numbers))))

If you call the average function with an empty collection, you get an exception:

(average [])
ArithmeticException Divide by zero  clojure.lang.Numbers.divide (Numbers.java:156)

Normally you’d check for the empty collection, but you can add a try/catch block to illustrate:
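A minimal sketch, assuming a variant named safe-average that catches the division error and falls back to 0:

```clojure
(defn safe-average [numbers]
  (let [total (apply + numbers)]
    (try
      (/ total (count numbers))
      ;; Dividing by (count []) = 0 throws ArithmeticException.
      (catch ArithmeticException e
        (println "Divided by zero!")
        0))))

(safe-average [1 2 3])
;=> 2
(safe-average [])
;; prints: Divided by zero!
;=> 0
```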

The general form of using try/catch/finally is straightforward:

(try expr* catch-clause* finally-clause?)

The form accepts multiple expressions as part of the try clause and multiple catch clauses. The finally clause is optional. The expressions passed to the try clause are evaluated one by one, and the value of the last is returned. If any of them generate an exception, the appropriate catch clause is executed based on the type (Java class) of the exception, and the value of that is then returned. The optional finally clause is always executed for any side effects that need to be guaranteed, but nothing is ever returned from it. For example:
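The following sketch wraps such a form in a hypothetical function named attempt-division; the returned strings are illustrative. The catch clauses are deliberately listed with the more general RuntimeException first:

```clojure
(defn attempt-division []
  (try
    (print "Attempting division... ")
    (/ 1 0)
    ;; Clauses are tried in order, so this one is checked first.
    (catch RuntimeException e "Runtime exception!")
    (catch ArithmeticException e "DIVIDE BY ZERO!")
    (catch Throwable e "Unknown exception encountered!")
    ;; Runs for its side effect only; its value is discarded.
    (finally
      (println "done."))))

(attempt-division)
;; prints: Attempting division... done.
;=> "Runtime exception!"
```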

Notice that the RuntimeException catch clause matched, not the ArithmeticException clause, even though this exception type is a better match. The reason is that catch clauses are tried in order and the first possible match is used. ArithmeticException is a kind of RuntimeException, so the RuntimeException test matched and that catch clause was executed. You should arrange your catch clauses from most specific to least specific exception type to avoid confusion about which clause will match.

Exceptions can be thrown just as easily using the throw form. In any place where you wish to throw an exception, you can do something like the following:

(throw (Exception. "this is an error!"))
Exception this is an error! user/eval807 (NO_SOURCE_FILE:1)

throw accepts a java.lang.Throwable instance, so any kind of exception can be thrown using it.

That covers the basics of using the try/catch/finally form as well as throwing exceptions. This isn’t a commonly used feature of the Clojure language because there are several helper macros that take care of many situations where you might need to use this form. You’ll see this more in chapter 5.

3.3. Functions

As discussed in chapter 1, Lisp was born in the context of mathematics and is a functional language. A functional language, among other things, treats functions as first-class elements. This means that the following things are true:

  • Functions can be created dynamically (at runtime).
  • Functions can be accepted as arguments by other functions.
  • Functions can be returned from functions as return values.
  • Functions can be stored as elements inside other data structures (for example, lists).

You’ve been creating and using functions for at least a chapter, but we haven’t yet fully explored all their features and uses. This section will give you a more detailed look at what functions are and how they work, and you’ll see several code examples that illustrate their more advanced uses. We’ll begin by defining simple functions, with both fixed and variable number of parameters. After that, we’ll examine anonymous functions and a few shortcuts for using them, followed by using recursion as a means of looping. We’ll end the section with a discussion on higher-order functions and closures. To get started, let’s examine the means that Clojure provides to define your own functions.

3.3.1. Defining functions

Functions are defined using the defn macro. The syntax of the defn macro is

(defn function-name
  doc-string?
  metadata-map?
  [parameter-list*]
  conditions-map?
  body-expressions*)

Here, the symbols with a question mark at the end are optional. In other words, although function-name and parameters are required, doc-string, metadata-map, and conditions-map are optional. Before discussing more details of this structure, let’s take a quick look at an example. This example is quite basic; all it does is accept an item cost and the number of items and returns a total by multiplying them together:

(defn total-cost [item-cost number-of-items]
  (* item-cost number-of-items))

Here, total-cost is the name of the new function defined. It accepts two parameters: item-cost and number-of-items. The body of the function is the form that’s the call to the multiply function (*), which is passed the same two arguments. There’s no explicit return keyword in Clojure; instead, function bodies have an implicit do block surrounding them, meaning that the value of the last expression in the function body is returned. By the way, you can just type these examples (and the others in this book) at the REPL to see how they work.

Notice that defn is described as a macro. There’s a whole chapter on macros coming up (chapter 7), but it’s worth mentioning that the defn form expands to a def. For example, the definition of the previous total-cost function is expanded to this:

(def total-cost (fn [item-cost number-of-items]
                  (* item-cost number-of-items)))

total-cost is what Clojure calls a var. Note also that the function is in turn created using the fn macro. Because creating such vars and pointing them to a function is common, the defn macro was included in the language as a convenience.

If you wanted to, you could add a documentation string to the function, by passing a value for the doc-string parameter you saw earlier:

(defn total-cost
  "return line-item total of the item and quantity provided"
  [item-cost number-of-items]
  (* item-cost number-of-items))

In addition to providing a comment that aids in understanding this function, the documentation string can later be called up using the doc macro. This is because the doc-string is simply syntactic sugar to add the :doc key to the function var’s metadata that the doc macro can then read. To see this, type the following at the REPL:

(meta #'total-cost)
;=> {:ns #<Namespace user>, :name total-cost, :file "NO_SOURCE_FILE",
        :column 1, :line 1, :arglists ([item-cost number-of-items]),
        :doc "return line-item total of the item and quantity provided"}
(doc total-cost)
-------------------------
user/total-cost
([item-cost number-of-items])
 return line-item total of the item and quantity provided
;=> nil

If you hadn’t defined the function with the doc-string, then doc wouldn’t have been able to return any documentation other than the function name and parameter list.

The metadata-map is rarely seen outside of macros that define functions, but it’s good to know what it means. It’s simply another way to add metadata to the defined var. Observe what happens when you use the metadata-map field:
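A sketch of the comparison, using hypothetical function names and keys (:a and :b) chosen only for illustration. The select-keys call just trims the var metadata down to the keys of interest:

```clojure
;; Metadata on the name symbol.
(defn ^{:a 1} myfn-metadata [])

;; The same key supplied through the metadata-map instead.
(defn myfn-attr-map {:a 1} [])

;; Both at once: the metadata-map is farther to the right.
(defn ^{:a 1} myfn-both {:a 2 :b 3} [])

;; Name metadata, doc-string, and metadata-map all supply :doc.
(defn ^{:a 1 :doc "doc 1"} myfn-redundant-docs
  "doc 2"
  {:a 2 :b 3 :doc "doc 3"}
  [])

(select-keys (meta #'myfn-metadata) [:a])
;=> {:a 1}
(select-keys (meta #'myfn-attr-map) [:a])
;=> {:a 1}
(select-keys (meta #'myfn-both) [:a :b])
;=> {:a 2, :b 3}
(select-keys (meta #'myfn-redundant-docs) [:a :b :doc])
;=> {:a 2, :b 3, :doc "doc 3"}
```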

As you can see, adding metadata to the function name symbol and using a metadata-map are exactly equivalent. You can even use multiple ways of defining metadata at the same time: metadata farther to the right (like doc-string and metadata-map) will overwrite the same keys defined to their left.

Recall the general form of the defn macro from earlier. It has an optional conditions-map, and you’ll now see what it’s used for. Consider the following function definition:

(defn item-total [price quantity discount-percentage]
  {:pre [(> price 0) (> quantity 0)]
   :post [(> % 0)]}
  (->> (/ discount-percentage 100)
       (- 1)
       (* price quantity)
       float))

Here, item-total behaves as a normal function that applies a simple formula to the arguments and returns a result. Remember, you saw the thread-last (->>) operator in the previous chapter. At runtime, this function runs additional checks as specified by the hash map with the two keys :pre and :post. The checks it runs before executing the body of the function are the ones specified with the :pre key (hence called preconditions). In this case, there are two checks: one that ensures that price is greater than zero and a second that ensures that quantity is also greater than zero.

Try it with valid input:

(item-total 100 2 0)
;=> 200.0
(item-total 100 2 10)
;=> 180.0

Now try it with invalid input:

(item-total 100 -2 10)
AssertionError Assert failed: (> quantity 0)  user/item-total

Note that in this case, the function didn’t compute the result but instead threw an AssertionError with an explanation of which condition failed. The Clojure runtime automatically takes care of running the checks and throwing an error if they fail.

Now let’s look at the conditions specified by the :post key, called the postconditions. The % in these conditions refers to the return value of the function. The checks are run after the function body is executed, and the behavior in the case of a failure is the same: an AssertionError is thrown along with a message explaining which condition failed. Here’s an example of a corner case that you may have forgotten to check for, but luckily you had a postcondition to catch it:

(item-total 100 2 110)
AssertionError Assert failed: (> % 0)  user/item-total

Now that you’ve seen how to add preconditions and postconditions to your functions, we’re ready to move on. We’ll next look at functions that can accept different sets of parameters.

Multiple arity

The arity of a function is the number of parameters it accepts. Clojure functions can be overloaded on arity, meaning that you can execute a different function body depending on the number of parameters the function was called with. To define functions with such overloading, you can define the various forms within the same function definition as follows:

(defn function-name
  ;; Note that each argument+body pair is enclosed in a list.
  ([arg1]      body-executed-for-one-argument-call)
  ([arg1 arg2] body-executed-for-two-argument-call)
  ;; More cases may follow.
)

Let’s look at an example:

(defn total-cost
  ([item-cost number-of-items]
    (* item-cost number-of-items))
  ([item-cost]
    (total-cost item-cost 1)))

Here, two arities of the total-cost function are defined. The first is of arity 2, and it’s the same as the one we defined earlier. The other is of arity 1, and it accepts only the first parameter, item-cost. Note that you can call any other version of the function from any of the other arities. For instance, in the previous definition, you call the dual-arity version of total-cost from the body of the single-arity one.

Variadic functions

We touched very briefly on variadic functions in chapter 2, but now we get the full story. A variadic function is a function with an arity that takes a variable number of arguments. Different languages support this in different ways; for example, C++ has the ellipsis, and Java has varargs. In Clojure, the same is achieved with the & symbol:

(defn total-all-numbers [& numbers]
  (apply + numbers))

Here, total-all-numbers is a function that can be called with any number of optional arguments. All the arguments are packaged into a single list called numbers, which is available to the body of the function. You can use this form even when you do have some required parameters. The general form of declaring a variadic function is as follows:

(defn name-of-variadic-function [param-1 param-2 & rest-args]
  (body-of-function))

Here, param-1 and param-2 behave as regular named parameters, and all remaining arguments will be collected into a list called rest-args. By the way, the apply function is a way of calling a function when you have the arguments inside a list. You’ll see this in more detail in section 3.3.2, “Calling functions.”
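To make this concrete, here is a short sketch. The describe-order function and its parameter names are hypothetical, added only to show a required parameter followed by rest arguments:

```clojure
(defn total-all-numbers [& numbers]
  (apply + numbers))

(total-all-numbers)
;=> 0
(total-all-numbers 1 2 3 4)
;=> 10

;; One required parameter, then any number of extra arguments.
(defn describe-order [customer & items]
  (str customer " ordered " (count items) " item(s)"))

(describe-order "rob")
;=> "rob ordered 0 item(s)"
(describe-order "rob" :book :pen)
;=> "rob ordered 2 item(s)"
```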

Notice that a variadic function can have other nonvariadic arities, too. (Nonvariadic arities are called fixed arities.) The only restriction is that the variadic arity must have at least as many required arguments as the longest fixed arity. For example, the following is a valid function definition:

(defn many-arities
  ([]             0)
  ([a]            1)
  ([a b c]        3)
  ([a b c & more] "variadic"))
;=> #'user/many-arities
(many-arities)
;=> 0
(many-arities "one argument")
;=> 1
(many-arities "two" "arguments")
ArityException Wrong number of args (2) passed to: user/many-arities  clojure.lang.AFn.throwArity (AFn.java:429)
(many-arities "three" "argu-" "ments")
;=> 3
(many-arities "many" "more" "argu-" "ments")
;=> "variadic"

Recursive functions

Recursive functions are those that either directly or indirectly call themselves. Clojure functions can certainly call themselves using their names, but this form of recursion consumes the stack. If enough recursive calls are made, eventually the stack will overflow. This is how things work in most programming languages. There’s a feature in Clojure that circumvents this issue. You’ll first write a recursive function that will blow the stack:
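A minimal sketch of such a function, assuming it is named count-down and prints every hundredth value, consistent with the output shown next:

```clojure
(defn count-down [n]
  (when-not (zero? n)
    (when (zero? (rem n 100))
      (println "count-down:" n))
    ;; Ordinary self-call by name: each call adds a stack frame.
    (count-down (dec n))))

;; Small arguments work fine.
(count-down 300)
;; prints:
;; count-down: 300
;; count-down: 200
;; count-down: 100
;=> nil
```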

If you try calling count-down with a large number, for instance 100,000, you’ll get a StackOverflowError thrown at you:

(count-down 100000)
count-down: 100000
count-down: 99900
count-down: 99800
...
count-down: 90200
StackOverflowError   clojure.lang.Numbers$LongOps.remainder (Numbers.java:505)

You’ll now see how to ensure that this doesn’t happen.

In the last chapter, you saw the loop/recur construct that allowed you to iterate through sequences of data. The same recur form can be used to write recursive functions. When used in the “tail” position of a function body, recur binds its arguments to the same names as those specified in its parameter list. You’ll rewrite count-down using recur:

(defn count-downr [n]
  (when-not (zero? n)
    (when (zero? (rem n 100))
      (println "count-down:" n))
    (recur (dec n))))

This now works for any argument, without blowing the stack. The change is minimal because at the end of the function body, recur rebinds the function parameter n to (dec n), which then proceeds down the function body. When n finally becomes zero, the recursion ends. As you can see, writing self-recursive functions is straightforward. Writing mutually recursive functions is a bit more involved, and we’ll look at that next.

Mutually recursive functions

Mutually recursive functions are those that either directly or indirectly call each other. Let’s begin this section by examining an example of such a case. Listing 3.1 shows a contrived example of two functions, cat and hat, that call each other. Because cat calls hat before hat is defined, you need to declare it first. When given a large enough argument, they’ll throw the same StackOverflowError you saw earlier. Note that the declare macro calls def on each of its arguments. This is useful in cases where a function wants to call another function that isn’t defined yet, as is the case with a pair of mutually recursive functions in the following listing.

Listing 3.1. Mutually recursive functions that can blow the stack
(declare hat)
(defn cat [n]
  (when-not (zero? n)
    (when (zero? (rem n 100))
      (println "cat:" n))
    (hat (dec n))))

(defn hat [n]
  (when-not (zero? n)
    (when (zero? (rem n 100))
      (println "hat:" n))
    (cat (dec n))))

Let’s now fix this problem. You can’t use recur because recur is only useful for self-recursion. Instead, you need to modify the code to use a special Clojure function called trampoline. To do so, you’ll make a slight change to the definition of cat and hat. The new functions are shown in the following listing as catt and hatt.

Listing 3.2. Mutually recursive functions that can be called with trampoline
(declare hatt)
(defn catt [n]
  (when-not (zero? n)
    (when (zero? (rem n 100))
      (println "catt:" n))
    (fn [] (hatt (dec n)))))

(defn hatt [n]
  (when-not (zero? n)
    (when (zero? (rem n 100))
      (println "hatt:" n))
    (fn [] (catt (dec n)))))

The difference is so minor that you could almost miss it. Consider the definition of catt, where instead of making the recursive call to hatt, you now return an anonymous function that when called makes the call to hatt. The same change is made in the definition of hatt. You’ll learn more about anonymous functions in section 3.3.5.

Because these functions no longer perform their recursion directly, you have to use another function to call them. A function that accepts another function as an argument is called a higher-order function. The higher-order function you need here is trampoline, and here’s an example of using it:

(trampoline catt 100000)
catt: 100000
catt: 99900
...
catt: 200
catt: 100
;=> nil

This doesn’t blow the stack and works as expected. Internally, trampoline works by calling recur. Here’s the implementation:
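The sketch below mirrors the definition in clojure.core with the docstring omitted. It is named my-trampoline here (a hypothetical name) to avoid shadowing the core function, and count-to-zero is an invented helper used only to exercise it:

```clojure
(defn my-trampoline
  ([f]
   (let [ret (f)]
     (if (fn? ret)
       (recur ret)   ; keep bouncing while a function comes back
       ret)))        ; any other value is the final result
  ([f & args]
   (my-trampoline #(apply f args))))

;; Returns either a final value or a thunk for the next step.
(defn count-to-zero [n]
  (if (zero? n)
    :done
    #(count-to-zero (dec n))))

(my-trampoline count-to-zero 100000)
;=> :done
```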

Notice that trampoline is a higher-order function. It calls the function represented by the argument f, binds the return value with a let form, and calls recur (which jumps back to the top of the single-argument arity) whenever that return value is itself a function. You could have done this yourself, but conveniently trampoline is available to you as part of the core set of Clojure functions.

You’ve now seen how recursive functions can be written in Clojure. Although using recur and trampoline is the correct and safe way to write such functions, if you’re sure that your code isn’t in danger of consuming the stack, it’s okay to write them without using these. Now that you’ve seen the basics of defining functions, let’s look at a couple of ways to call them.

3.3.2. Calling functions

Because functions are so fundamental to Clojure, you’ll be calling a lot of functions in your programs. The most common way of doing this looks similar to the following:

(+ 1 2 3 4 5)
;=> 15

Here, the symbol + represents a function that adds its arguments. As a side note, Clojure doesn’t have the traditional operators present in other languages. Instead, most operators are defined in Clojure itself, as any other function or macro. Coming back to the previous example, the + function is variadic, and it adds up all the parameters passed to it and returns 15.

There’s another way to evaluate a function. Let’s say someone handed you a sequence called list-of-expenses, each an amount such as 39.95M. In a language such as Java, you’d have to perform some kind of iteration over the list of expense amounts, combined with collecting the result of adding them. In Clojure, you can treat the list of numbers as arguments to a function like +. The evaluation, in this case, is done using a higher-order function called apply:

(apply + list-of-expenses)

The apply function is extremely handy, because it’s quite common to end up with a sequence of things that need to be used as arguments to a function. This is because a lot of Clojure programs use the core sequence data structures to do their job.
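For instance, assuming list-of-expenses holds five hypothetical amounts, a sketch of the call looks like this:

```clojure
;; Hypothetical data standing in for the sequence you were handed.
(def list-of-expenses [39.95M 39.95M 39.95M 39.95M 39.95M])

;; Equivalent to (+ 39.95M 39.95M 39.95M 39.95M 39.95M).
(apply + list-of-expenses)
;=> 199.75M

;; Arguments given before the sequence are passed first.
(apply + 1 2 [3 4 5])
;=> 15
```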

As you saw, apply is a higher-order function that accepts another function as its first parameter. Higher-order functions are those that accept one or more functions as parameters, or return a function, or do both. You’ll now learn a bit more about this powerful concept by looking at a few examples of such functions provided by Clojure.

3.3.3. Higher-order functions

As we discussed in the previous chapter, functions in Clojure are first-class entities. Among other things, this means that functions can be treated similarly to data: they can be passed around as arguments and can be returned from functions. Functions that do these things are called higher-order functions.

Functional code makes heavy use of higher-order functions. The map function that you saw in chapter 2 is one of the most commonly used higher-order functions. Other common ones are reduce, filter, some, and every?. You saw simple examples of map, reduce, and filter in chapter 2. Higher-order functions aren’t just convenient ways of doing things such as processing lists of data but are also the core of a programming technique known as function composition. In this section, we’ll examine a few interesting higher-order functions that are a part of Clojure’s core library.

every?

every? is a function that accepts a testing function that returns a boolean (such functions are called predicate functions) and a sequence. It then calls the predicate function on each element of the provided sequence and returns true if they all return a truthy value; otherwise, it returns false. Here’s an example:
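A minimal sketch, assuming a vector named bools (the name used in the sentence that follows) and the core predicate true?:

```clojure
(def bools [true true true false false])

;; true? returns true only for the boolean value true.
(every? true? bools)
;=> false
```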

This returns false because not every value in the bools vector is true.

some

some has the same interface as every?—that is, it accepts a predicate and a sequence. It then calls the predicate on each element in the sequence and returns the first logically true value it gets. If none of the calls return a logically true value, some returns nil. Here’s an example, which is a quick-and-dirty way to check if a particular value exists in a sequence:

(some (fn [p] (= "rob" p)) ["kyle" "siva" "rob" "celeste"])
;=> true

This returns true because some returns the first logically true value produced by applying the anonymous function to each element of the vector. In this case, the third element of the vector is "rob", so the predicate returns true for it.
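Because some hands back whatever the predicate returned rather than a strict boolean, it can also find and transform the first matching item in one step. Here’s a small sketch with made-up data and a made-up predicate:

```clojure
;; The anonymous function returns nil for small numbers and a computed
;; value otherwise; some returns the first non-nil, non-false result.
(some (fn [n] (when (> n 2) (* n 10))) [1 2 3 4])
;=> 30

;; No element satisfies the predicate, so the result is nil.
(some (fn [n] (when (> n 10) (* n 10))) [1 2 3 4])
;=> nil
```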

constantly

constantly accepts a value v and returns a variadic function that always returns the same value v no matter what the arguments. It’s equivalent to writing (fn [& more] v). Here’s an example:

(def two (constantly 2)) ; same as (def two (fn [& more] 2))
                         ; or      (defn two [& more] 2)
;=> #'user/two
(two 1)
;=> 2
(two :a :b :c)
;=> 2

two is a function that returns 2, no matter what or how many arguments it’s called with.

constantly is useful when a function requires another function but you just want a constant value.
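As a rough illustration, here are two places where a function is required but only a fixed value is wanted. The data and the :retries key are invented for this sketch:

```clojure
;; map needs a function of one argument; constantly supplies one that ignores it.
(map (constantly 0) [:a :b :c])
;=> (0 0 0)

;; update-in applies a function to the old value; here the old value is simply replaced.
(update-in {:retries 3} [:retries] (constantly 0))
;=> {:retries 0}
```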

complement

complement is a simple function that accepts a function and returns a new one that takes the same number of arguments, does the same thing as the original function does, but returns the logically opposite value.

For instance, consider a function that checks if the first of two arguments is greater than the second:

(defn greater? [x y]
   (> x y))

And here it is in action:

(greater? 10 5)
;=> true
(greater? 10 20)
;=> false

Now, if you wanted to write a function that instead checked if the first of two arguments was smaller than the second, you could implement it in a similar way, but you could also just use complement:

(def smaller? (complement greater?))

And in use:

(smaller? 10 5)
;=> false

(smaller? 10 20)
;=> true

It’s a convenient function that in certain cases lets you implement one side of logical scenarios and declare the other side as being opposite.

comp

comp, short for composition, is a higher-order function that accepts multiple functions and returns a new function that’s a composition of those functions. The computation goes from right to left—that is, the new function applies its arguments to the rightmost of the original constituent functions, then applies the result to the one left of it, and so on, until all functions have been called. Here’s an example:

(def opp-zero-str (comp str not zero?))

Here are examples of using this:

(opp-zero-str 0)
;=> "false"

(opp-zero-str 1)
;=> "true"

Here, opp-zero-str when called with 1 first applies the function zero? to it, which returns false; it then applies not, which returns true, and then applies str, which converts it to a string "true".

partial

partial, which is short for partial application, is a higher-order function that accepts a function f and a few arguments to f but fewer than the number f normally takes. partial then returns a new function that accepts the remaining arguments to f. When this new function is called with the remaining arguments, it calls the original f with all the arguments together. Consider the following function that accepts two parameters, threshold and number, and checks to see if number is greater than threshold:

(defn above-threshold? [threshold number]
  (> number threshold))

To use it to filter a list, you might do this:

(filter (fn [x] (above-threshold? 5 x)) [ 1 2 3 4 5 6 7 8 9])
;=> (6 7 8 9)

With partial, you could generate a new function and use that instead:

(filter (partial above-threshold? 5) [ 1 2 3 4 5 6 7 8 9])
;=> (6 7 8 9)

The idea behind partial is to adapt functions that accept n arguments to situations where you need a function of fewer arguments, say n-k, and where the first k arguments can be fixed. This is the example you just saw previously, where a two-argument function above-threshold? was adapted to a situation that needed a single-argument function. For instance, you may want to use a library function that has a slightly different signature than what you require and needs to be adapted for your use.

memoize

Memoization is a technique that prevents functions from computing results for arguments that have already been processed. Instead, return values are looked up from a cache. Clojure provides a convenient memoize function that does this. Consider the following artificially slow function that performs a computation:

(defn slow-calc [n m]
  (Thread/sleep 1000)
  (* n m))

Calling it via a call to the built-in function time tells you how long it’s taking to run:

(time (slow-calc 5 7))
"Elapsed time: 1000.097 msecs"
;=> 35

Now, you can make this fast, by using the built-in memoize function:

(def fast-calc (memoize slow-calc))

For memoize to do its thing, you call fast-calc once with a set of arguments (say 5 and 7). You’ll notice that this run appears as slow as before, only this time the result has been cached. Now, you call it once more via a call to time:

(time (fast-calc 5 7))
"Elapsed time: 0.035 msecs"
;=> 35

This is pretty neat! Without any work at all, you’re able to substantially speed up the function.

But there’s a big caveat to memoize: the cache that backs memoize doesn’t have a bounded size and caches input and results forever. Therefore, memoize should only be used with functions with a small number of possible inputs or else you’ll eventually run out of memory. If you need more advanced memoization features (such as caches with bounded size or with eviction policies) look at the much more powerful clojure.core.memoize library at https://github.com/clojure/core.memoize.

These are some examples of what higher-order functions can do, and these are only a few of those included with Clojure’s standard library. Next, you’ll learn more about constructing complex functions by building on smaller ones.

3.3.4. Writing higher-order functions

You can create new functions that use existing functions by combining them in various ways to compute the desired result. For example, consider the situation where you need to sort a given list of user accounts, where each account is represented by a hash map, and where each map contains the username, balance, and the date the user signed up. This is shown in the next listing.

Listing 3.3. Function composition using higher-order functions
(def users
  [{:username     "kyle"
    :firstname    "Kyle"
    :lastname     "Smith"
    :balance      175.00M             ; Use BigDecimals for money!
    :member-since "2009-04-16"}
   {:username     "zak"
    :firstname    "Zackary"
    :lastname     "Jones"
    :balance      12.95M
    :member-since "2009-02-01"}
   {:username     "rob"
    :firstname    "Robert"
    :lastname     "Jones"
    :balance      98.50M
    :member-since "2009-03-30"}])
(defn sorter-using [ordering-fn]
  (fn [collection]
    (sort-by ordering-fn collection)))
(defn lastname-firstname [user]
  [(user :lastname) (user :firstname)])
(defn balance [user] (user :balance))
(defn username [user] (user :username))
(def poorest-first (sorter-using balance))
(def alphabetically (sorter-using username))
(def last-then-firstname (sorter-using lastname-firstname))

Here, users is a vector of hash maps. Specifically, it contains three users representing Kyle, Zak, and Rob. Suppose you wanted to sort these users by their username. You can do this using the sort-by function: it’s a higher-order function the first argument of which is a key function. The key function must accept one of the items you’re sorting and return a key to sort it by. Let’s see this step by step. If you call username on every user, you get every user’s name:
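A sketch of that step is shown here. To keep it self-contained, it uses an abbreviated copy of the users vector that keeps only the :username key:

```clojure
;; Abbreviated copy of the users vector from listing 3.3 (other keys omitted).
(def users
  [{:username "kyle"}
   {:username "zak"}
   {:username "rob"}])

(defn username [user] (user :username))

;; Apply the key function to every user.
(map username users)
;=> ("kyle" "zak" "rob")
```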

This is the list that sort-by will see when sorting items. sort-by will sort the items as if it were instead sorting a list created by (map key-function items). Putting it all together:

(sort-by username users)
;=> ({:member-since "2009-04-16", :username "kyle", ...}
     {:member-since "2009-03-30", :username "rob",  ...}
     {:member-since "2009-02-01", :username "zak",  ...})

Notice that the order of users is now Kyle, Rob, and Zak.

But what if you want to create functions that always sort in a specific order, without having to specify an ordering function? That’s what the sorter-using function does. It accepts a key function called ordering-fn and returns a function that accepts a collection that it will always sort using sort-by and the original ordering-fn. Take a look at sorter-using again:

(defn sorter-using [ordering-fn]
  (fn [collection]
    (sort-by ordering-fn collection)))

You define sorter-using as a higher-order function, one that accepts another function called ordering-fn, which will be used as a parameter to sort-by. Note here that sorter-using returns a function, defined by the fn special form that you saw earlier. Finally, you define poorest-first and alphabetically as the two desired functions, which sort the incoming list of users by :balance and :username. This is as simple as calling sorter-using, thus:

(def poorest-first (sorter-using balance))

This is the same as

(defn poorest-first [users] (sort-by balance users))

Both produce a sequence of users sorted by balance:

(poorest-first users)
;=> ({:username "zak",  :balance  12.95M, ...}
     {:username "rob",  :balance  98.50M, ...}
     {:username "kyle", :balance 175.00M, ...})

But suppose you wanted to sort by two criteria: first by each user’s last name and then by first name if they share a last name with another user. You can do this by supplying an ordered collection as the sorting key for an item: sequences are sorted by comparing each of their members in order. For example, the lastname-firstname function returns a vector of the last then first name of a user:

(map lastname-firstname users)
;=> (["Smith" "Kyle"] ["Jones" "Zackary"] ["Jones" "Robert"])
(sort *1)
;=> (["Jones" "Robert"] ["Jones" "Zackary"] ["Smith" "Kyle"])

So you can use this to sort the full user records with the last-then-firstname function:

(last-then-firstname users)
;=> ({:lastname "Jones", :firstname "Robert",  :username "rob",  ...}
     {:lastname "Jones", :firstname "Zackary", :username "zak",  ...}
     {:lastname "Smith", :firstname "Kyle",    :username "kyle", ...})

The two functions username and balance are used in other places, so defining them this way is okay. If the only reason they were created was to use them in the definition of poorest-first and alphabetically, then they could be considered clutter. You’ll see a couple of ways to avoid the clutter created by single-use functions, starting with anonymous functions in the next section.

3.3.5. Anonymous functions

As you saw in the previous section, there may be times when you have to create functions for single use. A common example is when a higher-order function accepts another function as an argument. An example is the sorter-using function defined earlier. Such single-use functions don’t even need names, because no one else will be calling them. Therefore, instead of creating regular named functions that no one else will use, you can use anonymous functions.

You’ve seen anonymous functions before, even if we didn’t call them out. Consider this code snippet from earlier in the chapter:

(def total-cost
  (fn [item-cost number-of-items]
    (* item-cost number-of-items)))

As we discussed earlier, this code assigns a value to the total-cost var, which is the function created by the fn macro. To be more specific, the function by itself doesn’t have a name; instead, you use the var with the name total-cost to refer to the function. The function itself is anonymous. To sum up, anonymous functions can be created using the fn form. Let’s consider a situation where you need a sequence of dates of when your members joined (perhaps for a report). You can use the map function for this:

(map (fn [user] (user :member-since)) users)
;=> ("2009-04-16" "2009-02-01" "2009-03-30")

Here, you pass the anonymous function (which looks up the :member-since key in each user map) into the map function to collect the dates. This is a fairly trivial use case, but there are cases where this will be useful.

Before we move on, let’s look at a reader macro that helps with creating anonymous functions.

A shortcut for anonymous functions

We talked about reader macros in chapter 2. One of the reader macros provided by Clojure allows anonymous functions to be defined quickly and easily. The reader macro that does this is #(.

Here you’ll rewrite the code used to collect a list of member-joining dates using this reader macro:
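The rewritten call can be sketched as follows. An abbreviated copy of the users vector, with only the keys needed here, keeps the snippet self-contained:

```clojure
;; Abbreviated copy of the users vector from listing 3.3.
(def users
  [{:username "kyle" :member-since "2009-04-16"}
   {:username "zak"  :member-since "2009-02-01"}
   {:username "rob"  :member-since "2009-03-30"}])

;; #(% :member-since) stands in for (fn [user] (user :member-since)).
(map #(% :member-since) users)
;=> ("2009-04-16" "2009-02-01" "2009-03-30")
```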

That’s much shorter! The #(% :member-since) is equivalent to the anonymous function used in the previous version. Let’s examine this form in more detail.

The #(), with the body of the anonymous function appearing within the parentheses, creates an anonymous function. The % symbol represents a single argument. If the function needs to accept more than one argument, then %1, %2, ... can be used. The body can contain pretty much any code, except for nested anonymous functions defined using another #() reader macro. You can also use %& for the “rest” arguments for a variadic function: the rest arguments are those beyond the highest explicitly mentioned % argument. An example will clarify:

(#(vector %&) 1 2 3 4 5)
;=> [(1 2 3 4 5)]
(#(vector % %&) 1 2 3 4 5)
;=> [1 (2 3 4 5)]
(#(vector %1 %2 %&) 1 2 3 4 5)
;=> [1 2 (3 4 5)]
(#(vector %1 %2 %&) 1 2)
;=> [1 2 nil]

You’ll now see another way to write such functions—a way that will result in even shorter code.

3.3.6. Keywords and symbols

Keywords are identifiers that begin with a colon—examples are :mickey and :mouse. You learned about keywords and symbols in chapter 2. They’re some of the most heavily used values in Clojure code, and they have one more property of interest. They’re also functions, and their use as functions is quite common in idiomatic Clojure.

Keyword functions accept one or two arguments. The first argument is a map, and the keyword looks itself up in this map. For example, consider one of the user maps from earlier:

(def person {:username "zak"
             :balance 12.95
             :member-since "2009-02-01"})

To find out what username this corresponds to, you’d do the following:

(person :username)
;=> "zak"

But now that you know that keywords behave as functions, you could write the same thing as

(:username person)
;=> "zak"

This would also return the same "zak". Why would you want to do such a strange-looking thing? To understand this, consider the code you wrote earlier to collect a list of all dates when users signed up, from the previous example:

(map #(% :member-since) users)
;=> ("2009-04-16" "2009-02-01" "2009-03-30")

Although this is short and easy to read, you could now make this even clearer by using the keyword as a function:

(map :member-since users)
;=> ("2009-04-16" "2009-02-01" "2009-03-30")

This is much nicer! Indeed, it’s the idiomatic way of working with maps and situations such as this one. We said earlier that keyword functions could accept a second optional parameter. This parameter is what gets returned if there’s no value in the map associated with the keyword. As an example, consider the following two calls:

(:login person)
;=> nil
(:login person :not-found)
;=> :not-found

The first call returns nil because person doesn’t have a value associated with :login. But if nil was a legitimate value for some key in the hash map, you wouldn’t be able to tell if it returned that or if no association was found. To avoid such ambiguity, you’d use the second form shown here, which will return :not-found. This return value tells you clearly that there was nothing associated with the key :login.
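To make the ambiguity concrete, here’s a small sketch using a hypothetical account map in which :login is present but explicitly nil, and :nickname is absent altogether:

```clojure
;; Hypothetical map: :login exists with a nil value; :nickname is missing.
(def account {:username "zak" :login nil})

;; Without a default, a present-but-nil key and a missing key look identical.
(:login account)
;=> nil
(:nickname account)
;=> nil

;; With a default, only the missing key falls back to :not-found.
(:login account :not-found)
;=> nil
(:nickname account :not-found)
;=> :not-found
```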

Symbols

Now let’s talk about symbols. In Clojure, symbols are identifiers that represent some value. Examples are users and total-cost, which as you’ve seen represent the list of users and a function. A symbol is a name as a value. To use the analogy of a dictionary, the word in a dictionary entry is the symbol but the definition of the word is a binding of that word to a particular meaning. A word and its definition aren’t the same—for example, a word could have multiple definitions, or it could have a different definition at a different time. The same principle applies to symbols: the symbol user is always the same as another symbol user, but they could point to (that is, be bound to) different values. For example, one could point to a var containing a function, but another one could be a local variable pointing to a user hash map.

Normally when the Clojure runtime sees a symbol like users, it automatically evaluates it and uses the value that the symbol represents. But you may wish to use symbols as is. You may desire to use symbols themselves as values, for instance, as keys in a map (or indeed in some kind of symbolic computation). To do this, you’d quote the symbols. Everything else works exactly the same as in the case of keywords, including its behavior as a function. Here’s an example of working with a hash map:

(def expense {'name "Snow Leopard" 'cost 29.95M})
;=> #'user/expense
(expense 'name)
;=> "Snow Leopard"
('name expense)
;=> "Snow Leopard"
('vendor expense)
;=> nil
('vendor expense :absent)
;=> :absent

You can see here that symbols behave similarly to keywords in this context. The optional parameter that works as the default return value works just like in the case of Clojure keywords.

Furthermore, as you saw earlier in this chapter and also in the previous one, it turns out that maps and vectors have another interesting property, which is that they’re also functions. Hash maps are functions of their keys, so they return the value associated with the argument passed to them. Consider the example from earlier:

(person :username)
;=> "zak"

person is a hash map, and this form works because it’s also a function. It returns "zak". Incidentally, hash maps also accept an optional second parameter, which is what is returned when a value associated with the key isn’t found, for instance:

(person :login :not-found)
;=> :not-found

Vectors also behave the same way; they’re functions of their indices. Consider the following example:

(def names ["kyle" "zak" "rob"])
;=> #'user/names
(names 1)
;=> "zak"

The call to names returns "zak", and this works because the vector names is a function. Note here that vector functions don’t accept a second argument, and if an index that doesn’t exist is specified, an exception is thrown:

(names 10)
IndexOutOfBoundsException   clojure.lang.PersistentVector.arrayFor (PersistentVector.java:107)
(names 10 :not-found)
ArityException Wrong number of args (2) passed to: PersistentVector  clojure.lang.AFn.throwArity (AFn.java:429)

The fact that vectors and hash maps are functions is useful when code is designed with function composition in mind. Instead of writing wrapper functions, these data structures can themselves be passed around as functions. This results in cleaner, shorter code.
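As a rough sketch of what this looks like in practice, the following passes a vector and a hash map directly to map and comp. The balances map is made up for the illustration:

```clojure
(def names ["kyle" "zak" "rob"])

;; Hypothetical lookup table from name to balance.
(def balances {"kyle" 175.00M "zak" 12.95M "rob" 98.50M})

;; The vector acts as a function of its indices.
(map names [2 0])
;=> ("rob" "kyle")

;; The map acts as a function of its keys.
(map balances names)
;=> (175.00M 12.95M 98.50M)

;; Both compose like any other function: index -> name -> balance.
(map (comp balances names) [2 0])
;=> (98.50M 175.00M)
```

No wrapper such as (fn [i] (balances (names i))) is needed; the data structures themselves do the work.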

It’s worth spending some time and experimenting with the various ideas explored in this section, because any nontrivial Clojure program will use most of these concepts. A large part of gaining proficiency with a language like Clojure is understanding and gaining proficiency with functional programming. Functional programming languages are great tools for designing things in a bottom-up way, because small functions can easily be combined into more complex ones. Each little function can be developed and tested incrementally, and this also greatly aids rapid prototyping. Having a lot of small, general functions that then combine to form solutions to the specific problems of the domain is also an important way to achieve flexibility.

Having seen how functions work, you’re ready to tackle another important element in the design of Clojure programs: its scoping rules. In the next section, we’ll explore scope. Scoping rules determine what’s visible where, and understanding this is critical to writing and debugging Clojure programs.

3.4. Scope

Now that you’ve seen the basics of defining functions, we’ll take a bit of a detour and show how scope works in Clojure. Scope, as it’s generally known, is the enclosing context where names resolve to associated values. Clojure, broadly, has two kinds of scope: static (or lexical) scope and dynamic scope. Lexical scope is the kind that programming languages such as Java and Ruby offer. A lexically scoped variable is visible only inside the textual block that it’s defined in (justifying the term lexical) and can be determined at compile time (justifying the term static).

Most programming languages offer only lexical scoping, and this is the most familiar kind of scope. The Lisp family has always also offered special variables that follow a different set of rules for dynamic scope. We’ll examine both in this section. We’ll first explore vars and how they can operate as special variables with dynamic scope. Then, we’ll examine lexical scope and how to create new lexically scoped bindings.

3.4.1. Vars and binding

Vars in Clojure are, in some ways, similar to globals in other languages. Vars are defined at the top level of any namespace, using the def special form. Here’s an example:

(def MAX-CONNECTIONS 10)

After this call, the MAX-CONNECTIONS var is available to other parts of the program. Remember, def always creates the var at the level of the enclosing namespace no matter where it’s called. For instance, even if you call def from inside a function, it will create the var at the namespace level. For local variables, you’ll need the let form, which you’ve seen previously and which we’ll examine again shortly. The value of a var is determined by its binding. In this example, MAX-CONNECTIONS is bound to the number 10, and such an initial binding is called a root binding. A var can be defined without any initial binding at all, in the following form:

(def RABBITMQ-CONNECTION)

Here, RABBITMQ-CONNECTION is said to be unbound. It has no usable value: in current versions of Clojure, evaluating it yields only a placeholder object that reports the var as unbound, and code that tries to use that placeholder (for example, by calling it as a function) will fail with an exception. To set a value for an unbound var, or to change the value bound to a var, Clojure provides the binding form. Unfortunately, as defined previously, calling binding will throw an exception complaining that you can’t dynamically bind a non-dynamic var. To rebind vars, they need to be dynamic, which is done using the following metadata declaration:

(def ^:dynamic RABBITMQ-CONNECTION)
(binding [RABBITMQ-CONNECTION (new-connection)] ; new-connection stands in for your own function
  ;; do something here with RABBITMQ-CONNECTION
  )

The general structure of the binding form is that it begins with the symbol binding, followed by a vector of an even number of expressions. The first of every pair in the vector is a var, and it gets bound to the value of the expression specified by the second element of the pair. Binding forms can be nested, which allows new bindings to be created within each other. You’ll see a running example in a moment when we discuss the implications of ^:dynamic.
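Here’s a minimal sketch of a binding form with two pairs in its vector. The vars *host* and *port* and the endpoint function are hypothetical names used only for this illustration:

```clojure
(def ^:dynamic *host* "localhost")
(def ^:dynamic *port* 5672)

;; Reads both vars at call time, so it sees whatever bindings are in effect.
(defn endpoint []
  (str *host* ":" *port*))

(endpoint)
;=> "localhost:5672"

;; Two var/value pairs in one binding vector.
(binding [*host* "mq.example.com"
          *port* 5673]
  (endpoint))
;=> "mq.example.com:5673"

;; Outside the binding form, the root bindings apply again.
(endpoint)
;=> "localhost:5672"
```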

By the way, if you do try to rebind a var that wasn’t declared dynamic, you’ll see this exception:

java.lang.IllegalStateException:
Can't dynamically bind non-dynamic var: user/RABBITMQ-CONNECTION

This should tip you off that you need to use the ^:dynamic metadata on the var in question.

As you saw earlier in this chapter, the defn macro expands to a def form, implying that functions defined using defn are stored in vars. Functions can thus be redefined using a binding form as well. This is useful for things like implementing aspect-oriented programming or stubbing out behavior for unit tests.

Special variables

There’s one thing to note about vars: when declared with the ^:dynamic metadata, they become dynamically scoped. To understand what this means, again consider the following var:

(def ^:dynamic *db-host* "localhost")

If you now call a function like expense-report, which internally uses *db-host* to connect to a database, you’ll see numbers retrieved from the local database. For now, test this with a function that prints the binding to the console:

(defn expense-report [start-date end-date]
  (println *db-host*)) ;; can do real work

Now, once you’ve tested things to your satisfaction, you can have the same code connect to the production database by setting up an appropriate binding:

(binding [*db-host* "production"]
    (expense-report "2010-01-01" "2010-01-07"))

This will run the same code as defined in the expense-report function but will connect to a production database. You can prove that this happens by running the previous code; you’d see "production" printed to the console.

Note here that you managed to change what the expense-report function does, without changing the parameters passed to it (the function connects to a database specified by the binding of the *db-host* var). This is called action at a distance, and it must be done with caution. The reason is that it can be similar to programming with global variables that can change out from underneath you. But used with caution, it can be a convenient way to alter the behavior of a function.

Such vars that need to be bound appropriately before use are called special variables. A naming convention is used to make this intent clearer: these var names begin and end with an asterisk. In fact, if you have warnings turned on, and you name a var with asterisks and don’t declare it dynamic, Clojure will warn you of possible trouble.

Dynamic scope

You’ve seen how vars in general (and special variables in specific) can be bound to different values. We’ll now explore the earlier statement that vars aren’t governed by lexical scoping rules. We’ll implement a simple form of aspect-oriented programming, specifically a way to add a log statement to functions when they’re called. You’ll see that in Clojure this ends up being quite straightforward, thanks to dynamic scope.

Scope determines which names are visible at certain points in the code and which names shadow which other ones. Lexical scope rules are simple to understand; you can tell the visibility of all lexically scoped variables by looking at the program text (hence the term lexical). Ruby and Java are lexically scoped.

Dynamic scope doesn’t depend on the lexical structure of code; instead, the value of a var depends on the execution path taken by the program. If a function rebinds a var using a binding form, then the value of the var is changed for all code that executes within that binding form, including other functions that may be called. This works in a nested manner, too. If a function were to then use another binding form later on in the call stack, then from that point on all code would see this second value of the var. When the second binding form completes (execution exits), the previous binding takes over again, for all code that executes from that point onward. Look at the contrived example in the following listing.

Listing 3.4. Dynamic scope in action
(def ^:dynamic *eval-me* 10)
(defn print-the-var [label]
  (println label *eval-me*))
(print-the-var "A:")
(binding [*eval-me* 20] ;; the first binding
  (print-the-var "B:")
  (binding [*eval-me* 30] ;; the second binding
    (print-the-var "C:"))
  (print-the-var "D:"))
(print-the-var "E:")

Running this code will print the following:

A: 10
B: 20
C: 30
D: 20
E: 10

Let’s walk through the code. First, you create a var called *eval-me* with a root binding of 10. The print-the-var function causes the A: 10 to be printed. The first binding form changes the binding to 20, causing the following B: 20 to be printed. Then the second binding kicks in, causing the C: 30. Now, as the second binding form exits, the previous binding of 20 gets restored, causing the D: 20 to be printed. When the first binding exits after that, the root binding is restored, causing the E: 10 to be printed.

We’ll contrast this behavior with the let form in the next section. In the meantime, you’ll implement a kind of aspect-oriented logging functionality for function calls. Consider the following code.

Listing 3.5. A higher-order function for aspect-oriented logging
(defn ^:dynamic twice [x]
  (println "original function")
  (* 2 x))

(defn call-twice [y]
  (twice y))

(defn with-log [function-to-call log-statement]
  (fn [& args]
    (println log-statement)
    (apply function-to-call args)))

(call-twice 10)

(binding [twice (with-log twice "Calling the twice function")]
  (call-twice 20))

(call-twice 30)

If you run this, the output will be

original function
20
Calling the twice function
original function
40
original function
60

with-log is a higher-order function that accepts another function and a log statement. It returns a new function, which when called prints the log statement to the console and then calls the original function with any arguments passed in. Note the action at a distance behavior modification of the twice function. It doesn’t even know that calls to it are now being logged to the console, and, indeed, it doesn’t need to. Any code that uses twice also can stay oblivious to this behavior modification, and as call-twice shows, everything works. Note that when the binding form exits, the original definition of twice is restored. In this way, only certain sections of code (to be more specific, certain call chains) can be modified using the binding form. We’ll use this concept of action at a distance in the mocking and stubbing framework in chapter 10 on unit testing with Clojure.

We’ll now examine one more property of bindings.

Thread-local state

As we mentioned in chapter 1, Clojure has language-level semantics for safe concurrency. It supports writing lock-free multithreaded programs. Clojure provides several ways to manage state between concurrently running parts of your programs, and vars are one of them. We’ll say a lot more about concurrency and Clojure’s support for lock-free concurrency in chapter 6. Meanwhile, we’ll look at the dynamic scope property of vars with respect to thread-local storage.

A var’s root binding is visible to all threads, unless a binding form overrides it in a particular thread. If a thread does override the root binding via a call to the binding macro, that binding isn’t visible to any other thread. Again, a thread can create nested bindings, and each binding lasts only until execution leaves the binding form that created it. You’ll see more interaction between binding and threads in chapter 6.
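The thread-local nature of bindings can be sketched as follows. The names *request-id* and seen-in-new-thread are hypothetical, and the sketch deliberately starts a plain Java thread, which doesn’t inherit the caller’s bindings:

```clojure
(def ^:dynamic *request-id* :root)

;; Starts a plain java.lang.Thread and reports the value of *request-id*
;; as seen from inside that thread.
(defn seen-in-new-thread []
  (let [result (promise)
        worker (Thread. (fn [] (deliver result *request-id*)))]
    (.start worker)
    @result))

;; The current thread sees its own binding; the new thread sees the root binding.
(binding [*request-id* :bound]
  [*request-id* (seen-in-new-thread)])
;=> [:bound :root]
```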

Laziness and special variables

We mentioned Clojure’s lazy sequences in chapter 1 and talked about how functions like map are lazy. This laziness can be a source of frustration when interplay with dynamic vars isn’t clearly understood. Consider the following code:

(def ^:dynamic *factor* 10)
(defn multiply [x]
  (* x *factor*))

This simple function accepts a parameter and multiplies it by the value of *factor*, which is determined by its current binding. You’ll collect a few multiplied numbers using the following:

(map multiply [1 2 3 4 5])

This returns a list containing five elements: (10 20 30 40 50). Now, you’ll use a binding call to set *factor* to 20, and repeat the map call:

(binding [*factor* 20]
  (map multiply [1 2 3 4 5]))

Strangely, this also returns (10 20 30 40 50), despite the fact that you clearly set the binding of *factor* to 20. What explains this?

The answer is that a call to map returns a lazy sequence and this sequence isn’t realized until it’s needed. Whenever that happens (in this case, as the REPL tries to print it), the execution no longer occurs inside the binding form, and so *factor* reverts to its root binding of 10. This is why you get the same answer as in the previous case. To solve this, you need to force the realization of the lazy sequence from within the binding form:

(binding [*factor* 20]
  (doall (map multiply [1 2 3 4 5])))

This returns the expected (20 40 60 80 100). This shows the need to be cautious when mixing special variables with lazy forms. doall is a Clojure function that forces realization of lazy sequences, and it’s invaluable in such situations. Of course, sometimes you don’t want to realize an entire sequence, particularly if it’s large, so you need to be careful about this. Typically, the solution is to reestablish the bindings of the variable you care about locally within the function that’s generating the elements of the sequence.
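One way to sketch that last idea is with clojure.core’s bound-fn*, which wraps a function so that it reinstalls the bindings that were in effect when the wrapper was created. The name scaled is made up for this illustration, and this is only one possible approach:

```clojure
(def ^:dynamic *factor* 10)

(defn multiply [x]
  (* x *factor*))

;; The sequence stays lazy, but each call to the wrapped function
;; reestablishes the binding of *factor* captured inside the binding form.
(def scaled
  (binding [*factor* 20]
    (map (bound-fn* multiply) [1 2 3 4 5])))

;; Realized here, outside the binding form.
scaled
;=> (20 40 60 80 100)
```

The trade-off is that the bindings are pushed and popped around every call, which adds some overhead per element.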

In this section, we looked at dynamic scope and the associated binding. Next, we’ll take another look at the let form you saw earlier. Because they look so similar, we’ll also explore the difference between the let and binding forms.

3.4.2. The let form revisited

We briefly explored the let form in chapter 2, where you used it to create local variables. Let’s quickly look at another example of using it:

(let [x 10
      y 20]
  (println "x, y:" x "," y))

Here, x and y are locally bound values. Locals such as these are local because the lexical block of code they’re created in limits their visibility and extent (the time during which they exist). When execution leaves the local block, they’re no longer visible and may get garbage collected.

Clojure allows functions to be defined locally, inside a lexically scoped let form. Here’s an example:

(defn upcased-names [names]
  (let [up-case (fn [name] (.toUpperCase name))]
    (map up-case names)))
;=> #'user/upcased-names
(upcased-names ["foo" "bar" "baz"])
;=> ("FOO" "BAR" "BAZ")

Here, upcased-names is a function that accepts a list of names and returns a list of the same names, all in uppercase characters. up-case is a locally defined function that accepts a single string and returns an upcased version of it. The .toUpperCase function (with a prefixed dot) is Clojure’s way of calling the toUpperCase member function on a Java object (in this case a string). You’ll learn about Java interop in chapter 5.

Now, let’s examine the difference between the structurally similar let and binding forms. To do this, you’ll first reexamine the behavior of binding via the use of *factor*, as follows:

(def ^:dynamic *factor* 10)
(binding [*factor* 20]
  (println *factor*)
  (doall (map multiply [1 2 3 4 5])))

This prints 20 and then returns (20 40 60 80 100), as expected. Now, try the same thing with a let form:

(let [*factor* 20]
  (println *factor*)
  (doall (map multiply [1 2 3 4 5])))

This prints 20 as expected but returns (10 20 30 40 50). This is because, although the let sets *factor* to 20 inside the let body, it has no effect on the dynamic scope of the *factor* var. Only the binding form can affect the dynamic scope of vars.

Now that you know how the let form works and what it can be used for, let’s look at a useful feature of Clojure that’s possible thanks to two things: lexical scope and the let form.

3.4.3. Lexical closures

Let’s begin our exploration of what a lexical closure is by understanding what a free variable is. A variable is said to be free inside a given form if there’s no binding occurrence of that variable in the lexical scope of that form. Consider the following example:

(defn create-scaler [scale]
  (fn [x]
    (* x scale)))

In this example, within the anonymous function being returned, scale doesn’t appear in any kind of binding occurrence: it’s neither a parameter of that function nor a name created in a let form. Within the anonymous function, therefore, scale is a free variable. Only lexically scoped variables can be free, and the return value of the form in which they appear depends on their value at closure creation time. Forms that close over free variables (such as the anonymous function shown previously) are called closures. Closures are an extremely powerful feature of languages such as Clojure; in fact, even the name Clojure is a play on the word.

How do you use a closure? Consider the following code:

(def percent-scaler (create-scaler 100))

Here, we’re binding the percent-scaler var to the function object that gets returned by the create-scaler call. This anonymous function closes over the scale parameter, whose value (100) now lives on inside the percent-scaler closure. You can see this when you make a call to the percent-scaler function:

(percent-scaler 0.59)
;=> 59.0

This trivial example shows how closures are easy to create and use. A closure is an important construct in Clojure. It can be used for information hiding (encapsulation) because nothing from outside the closure can touch the closed-over variables. Clojure data structures are immutable, which reduces the need to make things private, and macros, closures, and multimethods together allow for powerful paradigms in which to create programs. They can make traditional object-oriented ideas (a la Java or C++) feel rather confining. You’ll learn about multimethods in chapter 4 and more about macros in chapter 7. We’ll also look at closures a lot more in chapter 8, which takes a deeper look at functional programming concepts.
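To make the information-hiding point concrete, here is a small sketch of a counter built from a closure. The names make-counter, total, and next-id are hypothetical, and the atom (a mutable reference type that hasn’t been introduced yet) is assumed here only to give the closure some state to protect.

```clojure
(defn make-counter []
  ;; total is visible only to the function returned below.
  (let [total (atom 0)]
    (fn []
      (swap! total inc))))

(def next-id (make-counter))

(next-id)
;=> 1
(next-id)
;=> 2
```

Nothing outside the returned function can read or reset total; each call to make-counter produces an independent counter with its own closed-over state.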

Now that you understand several basic structural aspects of writing Clojure code, we’ll address an organizational construct of Clojure: the namespace. Understanding how to use namespaces will aid you in writing larger programs that need to be broken into pieces for the sake of modularity and manageability.

3.5. Namespaces

When a program becomes larger than a few functions, computer languages allow the programmer to break it up into parts. An example of this facility is the package system in Java. Clojure provides the concept of namespaces for the same purpose. Programs can be broken up into parts, each being a logical collection of code—functions, vars, and the like.

Another reason why namespaces are useful is to avoid name collisions in different parts of programs. Imagine that you were writing a program that dealt with students, tests, and scores. If you were to then use an external unit-testing library that also used the word test, Clojure might complain about redefinition! Such problems can be handled by writing code in its own namespace.

3.5.1. ns macro

There’s a core var in Clojure called *ns*. This var is bound to the currently active namespace. Thus, you can influence what namespace the following code goes under by setting an appropriate value for this var. The ns macro does just this—it sets the current namespace to whatever you specify. Here’s the general syntax of the ns macro:

(ns name & references)

The name, as mentioned previously, is the name of the namespace being made current. If it doesn’t already exist, it gets created. The references that follow the name are optional and can be one or more of the following clauses: :use, :require, :import, :load, or :gen-class. You’ll see some of these in action in this section and then again in chapter 5, which covers Java interop. First, let’s look at an example of defining a namespace:

(ns org.currylogic.damages.calculators)
(defn highest-expense-during [start-date end-date]
 ;; (logic to find the answer)
)

highest-expense-during is now a function that lives in the namespace with the name of org.currylogic.damages.calculators. To use it, code outside this namespace would need to make a call (directly or indirectly) to use or require, or it can call the function with its fully qualified name, which is org.currylogic.damages.calculators/highest-expense-during. (import, by contrast, brings Java classes into a namespace.) We’ll explore these now through examples.

Public versus private functions

Before moving on to the next section, let’s take a quick look at private functions versus public functions. In Clojure, all functions belong to a namespace. The defn macro creates public functions, and these can be called from any namespace. To create private functions, Clojure provides the defn- macro, which works exactly the same, but such functions can only be called from within the namespace they’re defined in. defn- is itself just a shorthand for adding the metadata {:private true} to the var.
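The relationship between defn- and the :private metadata can be checked at the REPL. In this sketch the function names fee, surcharge, and total are hypothetical; the second definition spells out the metadata by hand, which should be equivalent to the first.

```clojure
;; defn- marks the var as private.
(defn- fee [] 5)

;; The same effect, written out with explicit metadata.
(defn ^:private surcharge [] 2)

;; A public function may freely call private ones in its own namespace.
(defn total [amount]
  (+ amount (fee) (surcharge)))

(:private (meta #'fee))
;=> true
(:private (meta #'surcharge))
;=> true
(:private (meta #'total))
;=> nil
(total 100)
;=> 107
```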

use, require

Imagine that you’re writing an HTTP service that responds to queries about a user’s expenses. Further imagine that you’re going to deal with both XML and JSON. To handle XML, you can use the Clojure-provided XML functions that live in the clojure.xml namespace.

As far as handling JSON is concerned, ideally you wouldn’t have to write code to handle the format. It turns out that the Clojure ecosystem already has a great library for this purpose called clojure.data.json.

Listing 3.6 shows what the code looks like now that you’ve selected the two libraries you need. Remember, to get it to work, you’ll need to add the dependency to the Lein configuration in project.clj by adding [org.clojure/data.json "0.2.1"] to the :dependencies list. (See appendix A for more information on setting up your Leiningen project and adding dependencies.) You’ll also need to restart the REPL for the classpath to include the new dependency.

Listing 3.6. Using external libraries by calling use
(ns org.currylogic.damages.http.expenses)
(use 'clojure.data.json)
(use 'clojure.xml)

(declare load-totals)

(defn import-transactions-xml-from-bank [url]
  (let [xml-document (parse url)]
    ;; more code here
))

(defn totals-by-day [start-date end-date]
  (let [expenses-by-day (load-totals start-date end-date)]
    (json-str expenses-by-day)))

Here, parse and json-str are functions that come from the clojure.xml and clojure.data.json libraries. The reason they’re available is that you called use on their namespaces. use takes all public functions from the namespace and includes them in the current namespace. The result is as though those functions were written in the current namespace. Although this is easy—and sometimes desirable—it often makes the code a little less understandable in terms of seeing where such functions are defined. require solves this problem, as shown in the following listing.

Listing 3.7. Using external libraries by calling require
(ns org.currylogic.damages.http.expenses)
(require '(clojure.data [json :as json-lib]))
(require '(clojure [xml :as xml-core]))

(declare load-totals)

(defn import-transactions-xml-from-bank [url]
  (let [xml-document (xml-core/parse url)]
    ;; more code here
))

(defn totals-by-day [start-date end-date]
  (let [expenses-by-day (load-totals start-date end-date)]
    (json-lib/json-str expenses-by-day)))

require makes functions available to the current namespace, as use does, but doesn’t include them the same way. They must be referred to using the full namespace name or the alias set up with the :as option, as shown in listing 3.7. This improves readability by making it clear where a function is actually coming from.

Finally, although these ways of using require (and use) work just fine, the idiomatic way is shown in the next listing.

Listing 3.8. Using external libraries by calling require
(ns org.currylogic.damages.http.expenses
  (:require [clojure.data.json :as json-lib]
            [clojure.xml :as xml-core]))

(declare load-totals)

(defn import-transactions-xml-from-bank [url]
  (let [xml-document (xml-core/parse url)]
    ;; more code here
))

(defn totals-by-day [start-date end-date]
  (let [expenses-by-day (load-totals start-date end-date)]
    (json-lib/json-str expenses-by-day)))

Notice how the require clauses are tucked into the namespace declaration. Similar approaches can be used with :use and :import. In general, prefer require, because it avoids the name collisions that can occur with use. For instance, if a library you use suddenly were to introduce a function named the same as one in your namespace, it would break your code. The use of require avoids this problem while also making it abundantly clear where each required function comes from.
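The collision point can be tried without adding any dependency, because clojure.string ships with Clojure. In this sketch the namespace example.report and its join function are hypothetical; the local join deliberately shares its name with clojure.string/join to show that an alias keeps the two apart.

```clojure
(ns example.report
  (:require [clojure.string :as string]))

;; A local function that happens to share its name with
;; clojure.string/join. Because the library is reached only through
;; the string alias, the two names never clash.
(defn join [names]
  (string/join ", " (map string/upper-case names)))

(join ["foo" "bar"])
;=> "FOO, BAR"
```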

Before moving on, let’s look at an aid for when you’re working with namespaces at the REPL.

reload and reload-all

As described in chapter 1, typical programming workflow in Clojure involves building up functions in an incremental fashion. As functions are written or edited, the namespaces that they belong to often need to be reloaded in the REPL. You can do this by using the following:

(use 'org.currylogic.damages.http.expenses :reload)
(require '(org.currylogic.damages.http [expenses :as exp]) :reload)

:reload can be replaced with :reload-all to reload all libraries that are used either directly or indirectly by the specified library. By the way, these functions are useful during development time, particularly when working with the REPL. When the program is deployed to run, the namespaces are all loaded during compile time, which happens only once.

Before wrapping up this section on namespaces, we’ll explore some options that Clojure provides to work with them programmatically.

3.5.2. Working with namespaces

Apart from the convenience offered by namespaces in helping keep code modular (and guarding from name collisions), Clojure namespaces can be accessed programmatically. In this section, we’ll review a few useful functions to do this.

create-ns and in-ns

create-ns is a function that accepts a symbol and creates a namespace named by it if it doesn’t already exist. in-ns is a function that accepts a single symbol as an argument and switches the current namespace to the one named by it. If it doesn’t exist, it’s created.

all-ns and find-ns

The no-argument function all-ns returns a sequence of all namespaces currently loaded. The find-ns function accepts a single symbol as an argument (no wildcards) and checks to see if it names a namespace. If so, it returns that namespace object, else nil.

ns-interns and ns-publics

ns-interns is a function that accepts a single argument, a symbol that names a namespace, and returns a map containing symbols to var mappings from the specified namespace. ns-publics is similar to ns-interns but instead of returning a map that contains information about all vars in the namespace, it returns only the public ones.

ns-resolve and resolve

ns-resolve is a function that accepts two arguments: a symbol naming a namespace and another symbol. If the second argument can be resolved to either a var or a Java class in the specified namespace, the var or class is returned. If it can’t be resolved, the function returns nil. resolve is a convenience function that accepts a single symbol as its argument and tries to resolve it (as ns-resolve does) in the current namespace.

ns-unmap and remove-ns

ns-unmap accepts a symbol naming a namespace and another symbol. The mapping for the specified symbol is removed from the specified namespace. remove-ns accepts a symbol naming a namespace and removes it entirely. This doesn’t work for the clojure.core namespace.

These are some of the functions provided by Clojure to programmatically work with namespaces. They’re useful in controlling the environment in which certain code executes. An example of this will appear in chapter 11, on domain-specific languages.
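The functions described in this section can be exercised together at the REPL. The following is a sketch under a few assumptions: the namespace name example.scratch and the var name answer are hypothetical, and clojure.core’s intern (not covered above) is used only to put a var into the new namespace.

```clojure
;; Create a namespace and add one var to it.
(create-ns 'example.scratch)
(intern 'example.scratch 'answer 42)

;; find-ns returns the namespace object, or nil if there's none.
(find-ns 'example.missing)
;=> nil

;; ns-interns maps symbols to the vars defined in the namespace.
(keys (ns-interns 'example.scratch))
;=> (answer)

;; ns-resolve finds vars as well as Java classes.
(ns-resolve 'example.scratch 'answer)
;=> #'example.scratch/answer
(ns-resolve 'example.scratch 'String)
;=> java.lang.String

;; Remove the mapping, then the namespace itself.
(ns-unmap 'example.scratch 'answer)
(ns-resolve 'example.scratch 'answer)
;=> nil
(remove-ns 'example.scratch)
(find-ns 'example.scratch)
;=> nil
```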

So far, you’ve seen a lot of the basics of Clojure and should now be in a position to read and write programs of reasonable complexity. In the next section, you’ll see a feature that isn’t found in languages such as Java and Ruby, namely, destructuring.

3.6. Destructuring

Several programming languages provide a feature called pattern matching, which is a form of function overloading based on structural patterns of arguments (as opposed to their number or types). Clojure has a somewhat less general form of pattern matching called destructuring. In Clojure, destructuring lets programmers bind names to only those parts of certain data structures that they care about. To see how this works, look at the following code, which doesn’t use destructuring:

(defn describe-salary [person]
  (let [first  (:first-name person)
        last   (:last-name  person)
        annual (:salary     person)]
    (println first last "earns" annual)))

Here, the let form doesn’t do much useful work; it only sets up local names for parts of the incoming person map. By using Clojure’s destructuring capabilities, such code clutter can be eliminated:

(defn describe-salary-2 [{first  :first-name
                          last   :last-name
                          annual :salary}]
  (println first last "earns" annual))

Here, the incoming data structure (in this case a map) is destructured, and useful parts of it are bound to names within the function’s parameter-binding form. In fact, extracting values of certain keys from inside maps is so common that Clojure provides an even more convenient way of doing this. You’ll see that and more ways to destructure maps in section 3.6.2, “Map bindings,” but before that let’s examine destructuring vectors.

3.6.1. Vector bindings

Vector destructuring supports any data structure that implements the nth function, including vectors, lists, seqs, arrays, and strings. This form of destructuring consists of a vector of names, each of which is assigned to the respective elements of the expression, looked up via the nth function. An example will make this clear:

(defn print-amounts [[amount-1 amount-2]]
  (println "amounts are:" amount-1 "and" amount-2))
(print-amounts [10.95 31.45])
amounts are: 10.95 and 31.45

This implementation of print-amounts is short and clear: you can read the parameter list and see that the single argument will be broken into two parts named amount-1 and amount-2. The alternative is to use a let form inside the function body to set up amount-1 and amount-2 by binding them to the first and last values of the incoming vector.

There are several options when it comes to using vector bindings. Imagine that the function print-amounts takes a vector that could contain two or more amounts (instead of only two in this contrived example). The following shows how you could deal with that situation.

Using & and :as

Consider the following example of destructuring:

(defn print-amounts-multiple [[amount-1 amount-2 & remaining]]
  (println "Amounts are:" amount-1 "," amount-2 "and" remaining))

If you make the following call

(print-amounts-multiple [10.95 31.45 22.36 2.95])

Clojure will print the following:

Amounts are: 10.95 , 31.45 and (22.36 2.95)

As shown here, the name following the & symbol gets bound to a sequence containing all the remaining elements from the sequence being destructured.

Another useful option is the :as keyword. Here’s another example:

(defn print-all-amounts [[amount-1 amount-2 & remaining :as all]]
  (println "Amounts are:" amount-1 "," amount-2 "and" remaining)
  (println "Also, all the amounts are:" all))

When you call this function as follows

(print-all-amounts [10.95 31.45 22.36 2.95])

it results in the following being printed to the console:

Amounts are: 10.95 , 31.45 and (22.36 2.95)
Also, all the amounts are: [10.95 31.45 22.36 2.95]

Notice that all, which was introduced via the :as destructuring option, was bound to the complete argument that was passed in.

Destructuring vectors makes it easy to deal with the data inside them. What’s more, Clojure allows nesting of vectors in destructuring bindings.

Nested vectors

Suppose you had a vector of vectors. Each inner vector was a pair of data—the first being a category of expense and the second the amount. If you wanted to print the category of the first expense amount, you could do the following:

(defn print-first-category [[[category amount] & _ ]]
  (println "First category was:" category)
  (println "First amount was:" amount))

Running this with an example, such as

(def expenses [[:books 49.95] [:coffee 4.95] [:caltrain 2.25]])
(print-first-category expenses)

results in Clojure printing the following:

First category was: :books
First amount was: 49.95

Note that in the argument list of print-first-category, & followed by the conventional throwaway name _ is used to ignore the remaining elements of the vector that you don’t care about. One thing to remember is that destructuring can take place in any binding form, including function parameter lists and let forms. Another thing to remember is that vector destructuring works for any data type that supports the nth and nthnext functions. On a practical level, for instance, if you were to implement the ISeq interface and create your own sequence data type, you’d be able to natively use not only all of Clojure’s core functions but also such destructuring.
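Both points can be seen in a short sketch: the same nested pattern works inside a let form, and sequential destructuring applies to strings and other sequences as well as vectors. The function name summarize and the keys of the map it returns are hypothetical.

```clojure
;; The nested pattern from print-first-category, used in a let form.
(defn summarize [expenses]
  (let [[[category amount] & others] expenses]
    {:category category
     :amount amount
     :remaining (count others)}))

(summarize [[:books 49.95] [:coffee 4.95] [:caltrain 2.25]])
;=> {:category :books, :amount 49.95, :remaining 2}

;; Strings and lazy sequences support nth, so they destructure too.
(let [[a b] "hi"]
  [a b])
;=> [\h \i]

(let [[a b & more] (range 5)]
  [a b more])
;=> [0 1 (2 3 4)]
```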

Before closing out this section on destructuring, let’s look at another useful form of destructuring binds—the one that uses maps.

3.6.2. Map bindings

You saw how convenient it is to destructure vectors into relevant pieces and bind only those instead of the whole vector. Clojure supports similar destructuring of maps. To be specific, Clojure supports destructuring of any associative data structure, which includes maps, strings, vectors, and arrays. Maps, as you know, can have any key, whereas strings, vectors, and arrays have integer keys. The destructuring binding form looks similar to ones you saw earlier; it’s a map in which each key is a local name to bind and each value is the key to look up in the data structure being destructured.
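Because vectors and strings are associative on their integer indexes, a map-style binding form can pick out positions directly. The following sketch uses hypothetical names (first-and-third, first-item, third-item, initial) to show the idea.

```clojure
;; Each local name on the left is bound to the value found at the
;; index on the right.
(defn first-and-third [items]
  (let [{first-item 0 third-item 2} items]
    [first-item third-item]))

(first-and-third [:a :b :c])
;=> [:a :c]

;; The same form works on a string.
(let [{initial 0} "clojure"]
  initial)
;=> \c
```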

Take a look at the example from earlier in the chapter:

(defn describe-salary-2 [{first  :first-name
                          last   :last-name
                          annual :salary}]
   (println first last "earns" annual))

As noted earlier, first, last, and annual get bound to the respective values from the map passed to describe-salary-2. Now suppose that you also want to bind a bonus percentage, which may or may not exist. Clojure provides a convenient option in map-destructuring bindings to handle such optional values, using the :or keyword:

(defn describe-salary-3 [{first  :first-name
                          last   :last-name
                          annual :salary
                          bonus  :bonus-percentage
                          :or {bonus 5}}]
  (println first last "earns" annual "with a" bonus "percent bonus"))

When called with arguments that contain all keys that are being destructured, it works similarly to the previous case:

(def a-user {:first-name       "pascal"
             :last-name        "dylan"
             :salary           85000
             :bonus-percentage 20})
(describe-salary-3 a-user)

This prints the following to the console:

pascal dylan earns 85000 with a 20 percent bonus

Here’s how it works if you call the function with an argument that doesn’t contain a bonus:

(def another-user {:first-name "basic"
                   :last-name  "groovy"
                   :salary     70000})
(describe-salary-3 another-user)

This binds bonus to the default value specified via the :or option. The output is

basic groovy earns 70000 with a 5 percent bonus

Finally, similar to the case of vectors, map bindings can use the :as option to bind the complete hash map to a name. Here’s an example:

(defn describe-person [{first  :first-name
                        last   :last-name
                        bonus  :bonus-percentage
                        :or {bonus 5}
                        :as p}]
  (println "Info about" first last "is:" p)
  (println "Bonus is:" bonus "percent"))

An example of using this function is

(def third-user {:first-name "lambda"
                 :last-name  "curry"
                 :salary     95000})
(describe-person third-user)

This causes the following to be echoed on the console:

Info about lambda curry is: {:first-name lambda,
                             :last-name curry,
                             :salary 95000}
Bonus is: 5 percent

This is all quite convenient and results in short, readable code. Clojure provides three options that make it even easier to destructure maps: the :keys, :strs, and :syms keywords. Here’s how to use :keys by writing a small function to greet your users:

(defn greet-user [{:keys [first-name last-name]}]
   (println "Welcome," first-name last-name))

When you run this, first-name and last-name get bound to values of :first-name and :last-name from inside the argument map. You can try it:

(def roger {:first-name "roger" :last-name "mann" :salary 65000})
(greet-user roger)

The output looks like this:

Welcome, roger mann

If your keys were strings or symbols (instead of keywords as in these examples), you’d use :strs or :syms. Incidentally, we’ll use map destructuring in chapter 7 to add keyword arguments to the Clojure language.
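For completeness, here is a sketch of the :strs and :syms variants, along with :keys combined with the :or option described earlier. The function names greet-by-strings, sum-point, and bonus-amount are hypothetical.

```clojure
;; :strs looks up string keys.
(defn greet-by-strings [{:strs [first-name last-name]}]
  (str "Welcome, " first-name " " last-name))

(greet-by-strings {"first-name" "roger" "last-name" "mann"})
;=> "Welcome, roger mann"

;; :syms looks up symbol keys.
(defn sum-point [{:syms [x y]}]
  (+ x y))

(sum-point {'x 1 'y 2})
;=> 3

;; :keys combines with :or to supply a default for a missing key.
(defn bonus-amount [{:keys [salary bonus-percentage]
                     :or {bonus-percentage 5}}]
  (/ (* salary bonus-percentage) 100))

(bonus-amount {:salary 65000})
;=> 3250
(bonus-amount {:salary 85000 :bonus-percentage 20})
;=> 17000
```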

We covered various ways that Clojure supports destructuring large, complex data structures into their components. This is a useful feature because it results in code that’s shorter and clearer. It improves the self-documenting nature of well-written code, because the destructuring binding tells the reader exactly what parts of the incoming data structure are going to be used in the code that follows.

3.7. Reader literals

In programming languages, a literal is the source code representation of a fixed value. For instance, the string "clojure" is a literal value, as is the number 42. Although most languages have data literal support for strings and numbers, a few go further. For instance, many have literal support for vectors and maps, expressed as something like [1 2 3 4] and {:a 1 :b 2}.

Some programming languages support many more varieties of literals, but almost no language lets you, the programmer, add more. Clojure lets you do this to a degree, through its reader literals.

You’ve seen reader macros already. Reader literals are a way to let the reader construct a specific data type for you, as defined by your data reader functions. For instance, imagine that you were using UUIDs in your program. The easiest way to construct them is via the java.util.UUID/randomUUID method:

(java.util.UUID/randomUUID)
;=> #uuid "197805ed-7aa2-4ff8-ae66-b94a838df2a8"

Imagine now that in a test you want to control what UUIDs are generated. Here’s a function that accepts the first eight characters of a UUID and hardcodes the rest:

(ns clj-in-act.ch3.reader
  (:import java.util.UUID))


(defn guid [four-letters-four-digits]
  (java.util.UUID/fromString
    (str four-letters-four-digits "-1000-413f-8a7a-f11c6a9c4036")))

Now, instead of struggling with random UUIDs over which you have no naming control, you can create (or re-create) known UUIDs at will:

(use 'clj-in-act.ch3.reader)
;=> nil
(guid "abcd1234")
;=> #uuid "abcd1234-1000-413f-8a7a-f11c6a9c4036"

You can imagine such known values for universally unique identifiers (UUIDs) being very useful when writing tests that deal with UUIDs. But calling the guid function each time is quite noisy. It would be nice if you could clean that up. Enter Clojure’s reader literals.

You’d create a file called data_readers.clj and put it in the root directory of the classpath. Because you’re using Leiningen, this would be the src folder in your project directory. In it could be a map containing your reader literal syntax and the respective functions that handle them. For instance, the following would be the contents of data_readers.clj with just one entry:

{G clj-in-act.ch3.reader/guid}

Now, whenever you wanted a known UUID, you could just do this:

#G "abcd1234"
;=> #uuid "abcd1234-1000-413f-8a7a-f11c6a9c4036"

This makes for far cleaner code, especially when many UUIDs are being created. Please take notice of the # before the G to form #G. Note that although this shows how reader literals work, it’s not okay to use unqualified syntax for your reader literals. In other words, G by itself is unqualified by any namespace, and so it could conflict with someone else’s, were they to name theirs the same. For this reason, the following namespaced version is preferred:

{clj-in-act/G clj-in-act.ch3.reader/guid}

And you’d use it like this:

#clj-in-act/G "abcd1234"
;=> #uuid "abcd1234-1000-413f-8a7a-f11c6a9c4036"

By namespacing your reader literals, you greatly reduce the chance that they’ll collide with anyone else’s. Reader literals should be used sparingly, but they make it particularly easy to work with data types, and they can enhance the readability of your code.
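If you’d like to experiment before creating a data_readers.clj file, one option is to bind clojure.core’s dynamic *data-readers* var around a call to read-string. This is only a sketch for trying things out at the REPL, not a replacement for the file-based setup described above; the name known-id is hypothetical, and guid is repeated so the snippet stands alone.

```clojure
(import 'java.util.UUID)

(defn guid [four-letters-four-digits]
  (UUID/fromString
    (str four-letters-four-digits "-1000-413f-8a7a-f11c6a9c4036")))

;; *data-readers* maps tag symbols to reader functions. Binding it here
;; makes the tag available only while read-string runs.
(def known-id
  (binding [*data-readers* {'clj-in-act/G guid}]
    (read-string "#clj-in-act/G \"abcd1234\"")))

known-id
;=> #uuid "abcd1234-1000-413f-8a7a-f11c6a9c4036"
```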

3.8. Summary

This was another long chapter! We explored a few important aspects of writing code in Clojure. We started off the chapter looking at metadata and exception handling. Then we looked at functions, which form the basis of the language, and you saw several ways to create and compose functions. We also examined scope—both lexical and dynamic—and how it works in Clojure. We then showed that once programs start getting large you can use namespaces to break up and organize them. Finally, we looked at the destructuring capabilities of Clojure—a feature that comes in handy when writing functions or let forms. We closed off the chapter by looking at Clojure’s reader literals.

Between the previous chapter and this one, we covered most of the basic features of the language. You can write fairly nontrivial programs with what you’ve learned so far. The next few chapters focus on a few features that are unique to Clojure—things like Java interoperability, concurrency support, multimethods, the macro system, and more.

In the next chapter, you’ll learn about multimethods. You’ll see how inheritance-based polymorphism is an extremely limited way of achieving polymorphic behavior and how multimethods are an open-ended system to create your own version of polymorphism that could be specific to your problem domain. By combining ideas from the next chapter with those we explored in this one, such as higher-order functions, lexical closures, and destructuring, you can create some rather powerful abstractions.
