Using external C/C++ libraries from Rust

A great deal of the system software written over the last three decades is in C/C++. You will often want to link to an existing C/C++ library from Rust, because rewriting everything in Rust (though desirable) is not practical for complex projects. Writing FFI bindings for these libraries by hand, however, is painful and error-prone. Fortunately, there are tools that generate bindings to C/C++ libraries automatically.

For this demo, the code required on the Rust side is much simpler than in the previous example of calling C/C++ code from Rust, because we'll use a neat crate called bindgen, which generates FFI bindings from C/C++ header files. Bindgen is the recommended tool for integrating a complex library with lots of APIs.

We'll use this crate to generate bindings for a simple C library, levenshtein.c (https://github.com/wooorm/levenshtein.c), which finds the minimum edit distance between two strings. The edit distance is used in a wide variety of applications, such as fuzzy string matching, natural language processing, and spell checkers. Let's create our Cargo project by running cargo new edit_distance --lib.
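To make the notion of edit distance concrete, here is a minimal pure-Rust sketch of the textbook dynamic-programming algorithm. It is only an illustration of what the C library computes, not the code in levenshtein.c; the function name edit_distance_sketch is hypothetical, and the sketch compares bytes rather than Unicode characters.

```rust
/// Textbook dynamic-programming Levenshtein distance over bytes.
/// Hypothetical helper for illustration; not the code in levenshtein.c.
fn edit_distance_sketch(a: &str, b: &str) -> usize {
    let a = a.as_bytes();
    let b = b.as_bytes();
    // Row 0: turning the empty prefix of `a` into the first j bytes of `b`
    // takes j insertions.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        // curr[0] = i: deleting the first i bytes of `a` yields the empty string.
        let mut curr = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            // Substituting costs 1 only when the two bytes differ.
            let substitution = prev[j - 1] + usize::from(a[i - 1] != b[j - 1]);
            let deletion = prev[j] + 1;
            let insertion = curr[j - 1] + 1;
            curr[j] = substitution.min(deletion).min(insertion);
        }
        prev = curr;
    }
    prev[b.len()]
}

fn main() {
    // One insertion turns "foo" into "fooo".
    assert_eq!(edit_distance_sketch("foo", "fooo"), 1);
    // The classic example: two substitutions and one insertion.
    assert_eq!(edit_distance_sketch("kitten", "sitting"), 3);
}
```

Keeping only two rows of the table is enough because each cell depends solely on the current and previous rows.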

Before we use bindgen, we need to install the LLVM and libclang packages that it depends on:

$ apt-get install llvm-3.9-dev libclang-3.9-dev clang-3.9

Next, in our Cargo.toml file, we'll add a build dependency on bindgen and the cc crate:

# edit_distance/Cargo.toml

[build-dependencies]
bindgen = "0.43.0"
cc = "1.0"

The bindgen crate will be used to generate bindings from the levenshtein.h header file, while the cc crate will be used to compile our library into a static archive that gets linked into our crate so that we can use it from Rust. Our library-related files reside in the lib folder at the crate root.

Next, we'll create our build.rs file, which will be run before any of our source files are compiled. It will do two things: first, it will compile levenshtein.c to a static library (.a) file, and second, it will generate bindings to the APIs defined in the levenshtein.h file:

// edit_distance/build.rs

use std::path::PathBuf;

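fn main() {
    // Rerun this script whenever anything in the crate directory changes.
    println!("cargo:rerun-if-changed=.");
    // Look for libraries in the current directory and link levenshtein.
    println!("cargo:rustc-link-search=.");
    println!("cargo:rustc-link-lib=levenshtein");

    // Compile the C source into a static library named levenshtein.
    cc::Build::new()
        .file("lib/levenshtein.c")
        .out_dir(".")
        .compile("levenshtein");

    // Generate Rust declarations for the APIs in the header file.
    let bindings = bindgen::Builder::default()
        .header("lib/levenshtein.h")
        .generate()
        .expect("Unable to generate bindings");

    let out_path = PathBuf::from("./src/");
    bindings
        .write_to_file(out_path.join("bindings.rs"))
        .expect("Couldn't write bindings!");
}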

In the preceding code, we tell Cargo that our library search path is our current directory and that the library we are linking against is called levenshtein. We also tell Cargo to rerun code in build.rs if any of our files in our current directory change:

println!("cargo:rerun-if-changed=.");
println!("cargo:rustc-link-search=.");
println!("cargo:rustc-link-lib=levenshtein");

Following that, we create a compilation pipeline for our library by creating a new Build instance and passing the C source file to the file method. We also set the output directory with the out_dir method and pass our library name to the compile method:

cc::Build::new()
    .file("lib/levenshtein.c")
    .out_dir(".")
    .compile("levenshtein");

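Next, we create a bindgen Builder instance, pass our header file location, and call generate(). We then write the generated bindings to a bindings.rs file by calling write_to_file:

let bindings = bindgen::Builder::default()
    .header("lib/levenshtein.h")
    .generate()
    .expect("Unable to generate bindings");

let out_path = PathBuf::from("./src/");
bindings
    .write_to_file(out_path.join("bindings.rs"))
    .expect("Couldn't write bindings!");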

Now, when we run cargo build, a bindings.rs file will be generated under src/. As we mentioned previously, it's good practice for all libraries that are exposing FFI bindings to provide a safe wrapper. So, under src/lib.rs, we'll create a function named levenshtein_safe that wraps the unsafe function from bindings.rs:

// edit_distance/src/lib.rs

mod bindings;

use crate::bindings::levenshtein;
use std::ffi::CString;

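pub fn levenshtein_safe(a: &str, b: &str) -> u32 {
    // CString::new fails if the input contains an interior NUL byte.
    let a = CString::new(a).unwrap();
    let b = CString::new(b).unwrap();
    // Both pointers stay valid for the call because a and b outlive it.
    unsafe { levenshtein(a.as_ptr(), b.as_ptr()) }
}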
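One caveat of a wrapper like this is that CString::new returns an error when its input contains an interior NUL byte, so the unwrap() calls would panic on such input. The following is a minimal sketch of how the conversion could report that case instead. The helper name to_c_strings is hypothetical and is not part of the crate built in this section; a wrapper returning a Result could call it before entering the unsafe block.

```rust
use std::ffi::{CString, NulError};

/// Hypothetical helper: converts both inputs to C strings, reporting an
/// interior NUL byte as an error instead of panicking.
fn to_c_strings(a: &str, b: &str) -> Result<(CString, CString), NulError> {
    Ok((CString::new(a)?, CString::new(b)?))
}

fn main() {
    // Ordinary strings convert without trouble.
    assert!(to_c_strings("foo", "fooo").is_ok());

    // A string with an embedded NUL byte is rejected; nul_position()
    // reports the byte offset of the offending NUL.
    let err = to_c_strings("fo\0o", "bar").unwrap_err();
    assert_eq!(err.nul_position(), 2);
}
```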

We import the unsafe function from bindings.rs, wrap it within our levenshtein_safe function, and call our levenshtein function in an unsafe block, passing C-compatible strings. It's time to test our levenshtein_safe function. We'll create a basic.rs file in an examples/ directory in our crate root, which has the following code:

// edit_distance/examples/basic.rs

use edit_distance::levenshtein_safe;

fn main() {
    let a = "foo";
    let b = "fooo";
    assert_eq!(1, levenshtein_safe(a, b));
}

We can run this with cargo run --example basic, and we should see no assertion failures, as the levenshtein_safe call should return 1.

By convention, a crate that houses only FFI bindings has the suffix -sys appended to its name. Most such crates on crates.io follow this convention.

This was a whirlwind tour of how to use bindgen to automate cross-language interaction. If you want similar automation for reverse FFI bindings, such as calling Rust from C, there is an equivalent project called cbindgen at https://github.com/eqrion/cbindgen, which can generate C header files for Rust crates. For instance, Webrender uses this crate to expose its APIs to other languages.

Given its legacy, C is the lingua franca of programming languages, and Rust has first-class support for it. A lot of other languages also call into C, which implies that your Rust code can be called from any language that can call into C. Let's make other languages talk to Rust.
