Most problems that are good targets for parallelization involve doing calculations in tight loops. These places are good for all kinds of performance optimizations, from hoisting conditionals out of them to fine-tuning compiler hints, which we will do here.
Being a dynamic language, Clojure doesn't require type declarations. However, if we know what types we are using, we can get better performance by including type hints in our code. This is helpful for object types, where Clojure can then resolve method calls at compile time, and also for primitive types, where Clojure can generate well-tuned code that doesn't include boxing or wrapping the primitive type in a heavier Java object.
For more information about type hints and working with Java primitives, see the documentation on interacting with Java from Clojure at http://clojure.org/java_interop.
Type hints are expressed as metadata tags for return types and object types. We'll see examples of both in this recipe. For this recipe, we'll revisit the Monte Carlo simulation to estimate the value of pi. We first saw this in the Paritioning Monte Carlo simulations for better pmap performance recipe. This example has enough low-level math which makes it a good candidate for this kind of optimization.
We'll use the Criterium library for benchmarking, so include these in our Leiningen project.clj
file:
(defproject parallel-data ""0.1.0"" :dependencies [[org.clojure/clojure ""1.6.0""] [criterium ""0.4.3""]])
Use it and the java.lang.Math
class in our script or REPL:
(use 'criterium.core) (import [java.lang Math])
From the Partitioning Monte Carlo simulations for better pmap performance recipe, we'll use the rand-point
and center-dist
functions.
We'll implement the Monte Carlo simulation without type hints, benchmark it, and then reimplement it with type hints to see whether we can get better performance.
n
random points in the unit square and tests how many of them fall within the upper-right quadrant of a circle centered at 0, 0. The ratio of those that fall within the circle, multiplied by 4, should approximate the value of pi:(defn mc-pi [n] (let [in-circle (->> (repeatedly n rand-point) (map center-dist) (filter #(<= % 1.0)) count)] (* 4.0 (/ in-circle n))))
user=> (bench (mc-pi 100000))
WARNING: Final GC required 1.476534955514225 % of runtime
Evaluation count : 1320 in 60 samples of 22 calls.
Execution time mean : 45.774244 ms
Execution time std-deviation : 621.709776 µs
Execution time lower quantile : 44.878862 ms ( 2.5%)
Execution time upper quantile : 46.787552 ms (97.5%)
Overhead used : 1.780493 ns
Found 3 outliers in 60 samples (5.0000 %)
low-severe 2 (3.3333 %)
low-mild 1 (1.6667 %)
Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
nil
center-dist
and add type hint metadata for the return type too:(defn center-dist-hint
(^double [[x y]]
(Math/sqrt (+ (Math/pow (double x) (double 2.0))
(Math/pow (double y)
(double 2.0))))))
mc-pi
to call center-dist-hint
and add a type hint to it:(defn mc-pi-hint
(^double [n]
(let [in-circle (double
(->>
(repeatedly n rand-point)
(map center-dist-hint)
(filter #(<= % (double 1.0)))
count))]
(double
(* (double 4.0) (/ in-circle (double n)))))))
Finally, let's see if this helps:
user=> (bench (mc-pi-hint 100000))
Evaluation count : 1380 in 60 samples of 23 calls.
Execution time mean : 46.317966 ms
Execution time std-deviation : 1.237294 ms
Execution time lower quantile : 44.981042 ms ( 2.5%)
Execution time upper quantile : 49.716563 ms (97.5%)
Overhead used : 1.780493 ns
Found 5 outliers in 60 samples (8.3333 %)
low-severe 2 (3.3333 %)
low-mild 3 (5.0000 %)
Variance from outliers : 14.1789 % Variance is moderately inflated by outliers
nil
In this case, it resulted in a slight slowdown. However, I have seen simple changes such as this result in good performance improvements for a minimal amount of work.
The first type hint we used is the ^double
, which indicates the functions' return types and parameters. These allow the Clojure compiler to generate a bytecode that's well-optimized for these return types and for working with the Java primitives.
The other use of type hints allow the Clojure compiler to resolve methods at compile time. The clojure.string
namespace has a good example of this:
user=> (require '[clojure.string :as string]) user=> (source string/trim) (defn ^String trim ""Removes whitespace from both ends of string."" {:added ""1.2""} [^CharSequence s] (.. s toString trim)) nil
We can see that the parameter to the string/trim
function is a CharSequence
, and the compiler can use this information to know which toString
to dispatch to.