© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
C. Milanesi, Beginning Rust, https://doi.org/10.1007/978-1-4842-7208-4_13

13. Defining Closures

Carlo Milanesi, Bergamo, Italy

In this chapter, you will learn:
  • Why there is a need for anonymous inline functions, with type inference for the arguments and the return value type, without having to write braces, and which can access the variables that are alive at the function definition point

  • How such lightweight functions, named “closures,” can be declared and invoked

The Need for Disposable Functions

The Rust way to sort an array in ascending order is this:
let mut arr = [4, 8, 1, 10, 0, 45, 12, 7];
arr.sort();
print!("{:?}", arr);

This will print: [0, 1, 4, 7, 8, 10, 12, 45].

But if you want to sort it in descending order, or by some other criterion, there are no prepackaged functions; you must invoke the sort_by function, passing it a comparison function. This comparison function receives references to two items, and it returns an indication of which of them must precede the other one:
let mut arr = [4, 8, 1, 10, 0, 45, 12, 7];
use std::cmp::Ordering;
fn desc(a: &i32, b: &i32) -> Ordering {
    if a < b { Ordering::Greater }
    else if a > b { Ordering::Less }
    else { Ordering::Equal }
}
arr.sort_by(desc);
print!("{:?}", arr);

This will print: [45, 12, 10, 8, 7, 4, 1, 0].

The desc function returns a value whose type is defined by the standard library in the following way:
enum Ordering { Less, Equal, Greater }

This works, but it has several drawbacks.

First, the desc function is defined to be used at only one point, in the statement that follows it. Usually you don’t create a very small function to be used at only one point; rather, the body of that function is expanded where it is needed. However, the library function sort_by requires a function. What is needed is an inline anonymous function, that is, a function declared at the same point where it is used.

Moreover, while the type specification is optional for variable declarations, it is required for the arguments and the return value of function declarations. These specifications, like the name of the function, are convenient when the function is invoked from faraway statements, and possibly by other programmers. But when you have to write a function to be invoked just where it has been declared, such type specifications are mostly annoying. So it would be convenient to be able to declare and invoke inline an anonymous function with type inference of its arguments and its return value.

Another drawback regards the need to enclose the function body in braces. Functions usually contain several statements, so having to enclose them in braces is not at all annoying. But anonymous functions often consist of a single expression, which it would be convenient to write without braces. Given that a block is still an expression, it would be nice to have an inline anonymous function with type inference and a single expression as its body.

Capturing the Environment

All that’s been said so far in this chapter is also valid in many other languages, C included. But Rust functions have an additional unusual limitation: they cannot access any variable that is declared outside of them. You can access static items, and also const items, but you cannot access stack-allocated (that is “let-declared”) variables. For example, this program is illegal:
let two = 2.;
fn print_double(x: f64) {
    print!("{}", x * two);
}
print_double(17.2);

Its compilation generates the error: can't capture dynamic environment in a fn item. By dynamic environment, the compiler means the set of variables that happen to be valid where the function is defined. For instance, in the previous example, the variable two is still defined when the print_double function is defined. The ability to access the variables of that dynamic environment is named capturing the environment.

Instead, this is valid:
const TWO: f64 = 2.;
fn print_double(x: f64) {
    print!("{}", x * TWO);
}
print_double(17.2);
and this is too:
static TWO: f64 = 2.;
fn print_double(x: f64) {
    print!("{}", x * TWO);
}
print_double(17.2);

This code is valid because const items and static items can be accessed by functions, but variables cannot.

Such a limitation has a good reason: const items and static items have fixed values, while the value of variables can change. So, if a function cannot access external variables, its behavior will change only if the value of its arguments changes. The ability to access external variables effectively introduces them into the programming interface of the function; yet they are not apparent from the function signature, so they make the code harder to understand.

But when a function can be invoked only where it has been defined, the fact that it accesses external variables does not cause it to be less understandable, because those external variables are available anyway where the function is invoked.

Therefore, the requirements of our desired feature are the following: an inline anonymous function, with type inference; a single expression as body; and the capture of any valid variable.

Closures

Because of its great usefulness, in Rust there actually is such a feature, which is named closure.

A closure is just a handier kind of function, fit to define small anonymous functions, and to invoke them just where they have been defined.

In fact, you can also define a closure, assign it to a variable (thereby giving it a name), and then invoke it later using that name. This is not the most typical usage of closures, though. Type annotations are also possible.
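As a minimal sketch of capturing, the print_double example that was rejected as a fn item compiles once the inner function is written as a closure. The wrapper function double_with_closure, and its use of format! instead of print!, are assumptions made here only so that the result can be checked; they do not appear in the earlier listing.

```rust
// Hypothetical helper: it returns the text instead of printing it,
// so that the result is easy to check.
fn double_with_closure(x: f64) -> String {
    let two = 2.;
    // Unlike a fn item, a closure may use the local variable `two`.
    let double = |x: f64| format!("{}", x * two);
    double(x)
}

fn main() {
    print!("{}", double_with_closure(17.2));
}
```

The closure is bound to the name double and invoked through that name, which also shows the named usage just described.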

Here is the earlier descending order sorting example, replacing the desc function with a closure, and keeping the code as similar to the previous one as possible:
let mut arr = [4, 8, 1, 10, 0, 45, 12, 7];
use std::cmp::Ordering;
let desc = |a: &i32, b: &i32| -> Ordering {
    if a < b { Ordering::Greater }
    else if a > b { Ordering::Less }
    else { Ordering::Equal }
};
arr.sort_by(desc);
print!("{:?}", arr);
The only differences with respect to the previous example are:
  • The let keyword was used instead of fn.

  • The = symbol has been added after the name of the closure.

  • The ( and ) symbols, which enclose the function arguments, have been replaced by the | (pipe) symbol.

  • A semicolon has been added after the closure declaration.

So far, there are no advantages, but we said that a closure can be defined where it has to be used, and that the type annotations and the braces are optional. Therefore, the previous code can be transformed into this:
let mut arr = [4, 8, 1, 10, 0, 45, 12, 7];
use std::cmp::Ordering;
arr.sort_by(|a, b|
    if a < b { Ordering::Greater }
    else if a > b { Ordering::Less }
    else { Ordering::Equal });
print!("{:?}", arr);

This is already a nice simplification. But there is more.

The standard library already contains the cmp function (shorthand for compare); this function returns an Ordering value according to which of its two arguments is greater. The two following statements are equivalent:
arr.sort();
arr.sort_by(|a, b| a.cmp(b));
Therefore, to obtain the inverted order, you can use either of the following statements:
arr.sort_by(|a, b| (&-a).cmp(&-b));
arr.sort_by(|a, b| b.cmp(a));
Here is a complete example:
let mut arr = [4, 8, 1, 10, 0, 45, 12, 7];
arr.sort_by(|a, b| b.cmp(a));
print!("{:?}", arr);

The use directive has also been removed, because it is no longer required.

The other way to write it is this:
let mut arr = [4, 8, 1, 10, 0, 45, 12, 7];
arr.sort_by(|a, b| (-*a).cmp(&-*b));
print!("{:?}", arr);

Let’s explain the closure passed to the sort_by function. This closure receives two references; the type of both a and b is &i32. To apply the minus sign, we dereference those references to get numbers, so we write *a and *b. The cmp function expects a reference as its argument, so we add the & symbol before -*b. The expression -*a could be written &-*a just as well, but for the receiver of a method call the reference is taken implicitly.
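The two closures can be checked against each other with a small sketch. The function names below are hypothetical, and the third variant, which uses std::cmp::Reverse with sort_by_key, is an alternative that this chapter does not discuss. Note that the negation-based variant would overflow if the array contained i32::MIN, so swapping the arguments of cmp is usually the safer choice.

```rust
use std::cmp::Reverse;

// Descending sort obtained by swapping the arguments of cmp.
fn desc_by_swap(mut arr: [i32; 8]) -> [i32; 8] {
    arr.sort_by(|a, b| b.cmp(a));
    arr
}

// Descending sort obtained by comparing the negated values.
// Caution: negating i32::MIN overflows.
fn desc_by_negation(mut arr: [i32; 8]) -> [i32; 8] {
    arr.sort_by(|a, b| (-*a).cmp(&-*b));
    arr
}

// Alternative (an assumption, not from this chapter):
// wrap each key in Reverse, which inverts its ordering.
fn desc_by_reverse_key(mut arr: [i32; 8]) -> [i32; 8] {
    arr.sort_by_key(|&x| Reverse(x));
    arr
}

fn main() {
    let arr = [4, 8, 1, 10, 0, 45, 12, 7];
    print!("{:?}", desc_by_swap(arr));
    // All three variants should agree on this input.
    assert_eq!(desc_by_swap(arr), desc_by_negation(arr));
    assert_eq!(desc_by_swap(arr), desc_by_reverse_key(arr));
}
```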

Rust closures are very efficient because they do not allocate heap memory, and because they are usually expanded inline by the compiler, thereby eliminating all temporary objects. They are actually quite similar to C++ lambdas. The C++ program corresponding to the previous Rust program is this:
#include <array>
#include <algorithm>
#include <iostream>
#include <iterator>
using namespace std;
int main() {
    auto arr = array<int, 8> {
        4, 8, 1, 10, 0, 45, 12, 7 };
    stable_sort(
        arr.begin(), arr.end(),
        [](int a, int b) { return b < a; });
    copy(
      arr.begin(),
      arr.end(),
      ostream_iterator<int>(cout, ", "));
}
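The claim that Rust closures need no heap allocation can be probed with std::mem::size_of_val. In this sketch, whose function names are hypothetical, a closure that captures nothing occupies zero bytes, while a closure that uses the local variable factor occupies only the space of one reference to it, because it borrows that variable.

```rust
use std::mem::size_of_val;

// Size in bytes of a closure that captures nothing.
fn size_without_capture() -> usize {
    let add_one = |a: i32| a + 1;
    size_of_val(&add_one)
}

// Size in bytes of a closure that borrows one local variable.
fn size_capturing_factor() -> usize {
    let factor: i32 = 2;
    let multiply = |a: i32| a * factor;
    size_of_val(&multiply)
}

fn main() {
    print!("{} {}", size_without_capture(), size_capturing_factor());
}
```

The second number depends on the pointer width of the target platform.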

Closure Invocation Syntax

Here is another example that shows six ways to invoke a closure:
let factor = 2;
let multiply = |a| a * factor;
print!("{}", multiply(13));
let multiply_ref = &multiply;
print!(
    " {} {} {} {} {}",
    (*multiply_ref)(13),
    multiply_ref(13),
    (|a| a * factor)(13),
    (|a: i32| a * factor)(13),
    |a| -> i32 { a * factor }(13));

This will print: 26 26 26 26 26 26.

This program contains six identical closure invocations. Each of them performs these steps:
  • It takes an i32 argument named a.

  • It multiplies the argument by the captured variable factor, whose value is 2.

  • It returns the result of such multiplication.

The argument is always 13, so the result is always 26.

In the second line, the first closure is declared, using type inference both for the argument a and for the return value. The body of the closure accesses the external variable factor, declared by the previous statement, so that variable is captured inside the closure with its current value. The closure is then used to initialize the variable multiply, whose type is inferred.

In the third line, the closure assigned to the multiply variable is invoked. This shows that closures are invoked just like any function.

In the fourth line, the address of the just declared closure is used to initialize the multiply_ref variable, whose type is that of a reference to a closure.

In the seventh line, the reference to a closure is dereferenced, obtaining a closure, and that closure is invoked.

In the eighth line, that closure is invoked without dereferencing the reference, as the dereference operation is implicit for a function invocation.

In the last three lines of the print! invocation, three anonymous closures are declared and invoked. The first one infers both the type of the argument and the type of the return value; the second one specifies the type of the argument and infers the type of the return value; and the third one infers the type of the argument and specifies the type of the return value.

Notice that the argument 13 is passed to the closure always enclosed in parentheses. To avoid confusing that expression (13) with the expression preceding it, which specifies the closure, in some cases such closure expression must be enclosed in parentheses too. In the last case, instead, the body of the closure had to be enclosed in braces, to separate it from the return value type specification.

The braces are also required when the closure contains several statements, like in this case:
print!(
    "{}",
    (|v: &Vec<i32>| {
        let mut sum = 0;
        for i in 0..v.len() {
            sum += v[i];
        }
        sum
    })(&vec![11, 22, 34]));

This will print 67, which is the sum of the numbers contained in the vector.

In this case, it was necessary to specify the type of the argument, as otherwise the compiler couldn’t infer it, and it would emit the error message “type annotations needed.”
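Under the same constraint, the sum could also be sketched with an iterator, which avoids indexing. The names sum_with_closure and sum_of are hypothetical; the argument type is still annotated, because the body of the closure calls a method on v.

```rust
// Hypothetical wrapper, used only so that the result can be checked.
fn sum_with_closure(numbers: &Vec<i32>) -> i32 {
    // The annotation on `v` lets the compiler resolve `v.iter()`.
    let sum_of = |v: &Vec<i32>| v.iter().sum::<i32>();
    sum_of(numbers)
}

fn main() {
    print!("{}", sum_with_closure(&vec![11, 22, 34]));
}
```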
