Benchmarking on stable Rust

The built-in benchmarking framework provided by Rust is unstable, but fortunately there are community-developed benchmarking crates that work on stable Rust. One popular crate that we'll explore here is Criterion.rs. This crate is designed to be easy to use while providing detailed information on the benchmarked code. It also maintains the state of the last run, reporting performance regressions (if any) on every run. Criterion.rs generates more detailed statistical reports than the built-in benchmark framework, and also produces helpful charts and graphs using gnuplot to make the results easier to understand.

To demonstrate using this crate, we'll create a new crate by running cargo new criterion_demo --lib. We will add the criterion crate to Cargo.toml as a dependency under the [dev-dependencies] section:

[dev-dependencies]
criterion = "0.1"

[[bench]]
name = "fibonacci"
harness = false

We have also added a new section known as [[bench]], which tells cargo that we have a new benchmark named fibonacci and that it does not use the built-in benchmark harness (harness = false), since we are using the criterion crate's own test harness.
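For orientation, the complete Cargo.toml for this crate would look roughly like the following (the package metadata shown here is illustrative; cargo new generates it for you):

```toml
[package]
name = "criterion_demo"
version = "0.1.0"

[dev-dependencies]
criterion = "0.1"

[[bench]]
name = "fibonacci"
harness = false
```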

Now, in src/lib.rs, we have a fast and a slow version of a function that computes the nth fibonacci number (with initial values of n0 = 0 and n1 = 1):

// criterion_demo/src/lib.rs

pub fn slow_fibonacci(nth: usize) -> u64 {
    if nth <= 1 {
        nth as u64
    } else {
        slow_fibonacci(nth - 1) + slow_fibonacci(nth - 2)
    }
}

pub fn fast_fibonacci(nth: usize) -> u64 {
    // Handle the base cases explicitly; without this guard the loop
    // below would never run for nth == 1 and would wrongly return 0.
    if nth <= 1 {
        return nth as u64;
    }
    let mut a = 0;
    let mut b = 1;
    let mut c = 0;
    for _ in 1..nth {
        c = a + b;
        a = b;
        b = c;
    }
    c
}
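As a quick sanity check (not part of the book's code), the two implementations should agree on the first few Fibonacci numbers. Here is a minimal sketch, with the functions inlined so the snippet compiles on its own:

```rust
// Sanity check: both implementations should return the same values.

fn slow_fibonacci(nth: usize) -> u64 {
    if nth <= 1 {
        nth as u64
    } else {
        slow_fibonacci(nth - 1) + slow_fibonacci(nth - 2)
    }
}

fn fast_fibonacci(nth: usize) -> u64 {
    if nth <= 1 {
        return nth as u64;
    }
    let mut a = 0;
    let mut b = 1;
    let mut c = 0;
    for _ in 1..nth {
        c = a + b;
        a = b;
        b = c;
    }
    c
}

fn main() {
    for n in 0..10 {
        assert_eq!(slow_fibonacci(n), fast_fibonacci(n));
    }
    println!("fib(8) = {}", fast_fibonacci(8)); // prints "fib(8) = 21"
}
```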

fast_fibonacci is the bottom-up, iterative solution for getting the nth fibonacci number, whereas slow_fibonacci is the naive recursive version. Now, Criterion.rs requires us to place our benchmarks inside a benches/ directory, which we create at the crate root. Within the benches/ directory, we also create a file named fibonacci.rs, matching the name given under [[bench]] in Cargo.toml. It has the following content:

// criterion_demo/benches/fibonacci.rs

#[macro_use]
extern crate criterion;
extern crate criterion_demo;

use criterion_demo::{fast_fibonacci, slow_fibonacci};
use criterion::Criterion;

fn fibonacci_benchmark(c: &mut Criterion) {
    c.bench_function("fibonacci 8", |b| b.iter(|| slow_fibonacci(8)));
}

criterion_group!(fib_bench, fibonacci_benchmark);
criterion_main!(fib_bench);

There's quite a lot going on here! In the preceding code, we first declare our required crates and import the fibonacci functions that we need to benchmark (fast_fibonacci and slow_fibonacci). There is also a #[macro_use] attribute above extern crate criterion: to use any macros from a crate, we need to opt in with this attribute, as macros are not exposed by default. It's similar to a use statement, which is used to bring module items into scope.

Now, criterion has a notion of benchmark groups, which hold related benchmark code. Accordingly, we create a function named fibonacci_benchmark, which we then pass to the criterion_group! macro, assigning the name fib_bench to this benchmark group. The fibonacci_benchmark function takes a mutable reference to a Criterion object, which holds the state of our benchmark runs. This object exposes a method called bench_function, to which we pass the benchmark code to run in a closure, along with a name (here, fibonacci 8). Then, we need to create the main benchmark harness, which generates code with a main function to run everything, by passing our benchmark group, fib_bench, to criterion_main!. Now, it's time to run cargo bench with the slow_fibonacci function inside the closure. We get the following output:

We can see that the recursive version of our fibonacci function takes about 106.95 ns to run on average. Now, if we replace slow_fibonacci with fast_fibonacci inside the same benchmark closure and run cargo bench again, we'll get the following output:

Great! The fast_fibonacci version takes just 7.8460 ns to run on average. That's expected, but the great thing here is the detailed benchmark report, which also shows a human-friendly message: Performance has improved. Criterion is able to show this regression report because it maintains the state of previous benchmark runs and uses that history to report changes in performance.
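For intuition about what the harness is doing, here is a naive, hand-rolled timing loop using only std::time. This is a rough sketch, not how Criterion.rs actually measures: Criterion.rs adds warm-up runs, outlier detection, and statistical analysis on top of this basic idea:

```rust
use std::time::Instant;

fn slow_fibonacci(nth: usize) -> u64 {
    if nth <= 1 {
        nth as u64
    } else {
        slow_fibonacci(nth - 1) + slow_fibonacci(nth - 2)
    }
}

fn main() {
    let iterations: u32 = 10_000;
    // Accumulate the results so the compiler cannot optimize the calls
    // away; Criterion.rs uses black_box for the same purpose.
    let mut sink: u64 = 0;
    let start = Instant::now();
    for _ in 0..iterations {
        sink = sink.wrapping_add(slow_fibonacci(8));
    }
    let elapsed = start.elapsed();
    println!(
        "average: {:.1} ns per call (checksum: {})",
        elapsed.as_nanos() as f64 / f64::from(iterations),
        sink
    );
}
```

Unlike this sketch, Criterion.rs also persists results between runs, which is what enables the regression report we saw above.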
