Calling C code from Rust

First, we'll look at an example of calling C code from Rust. We'll create a new binary crate and, from it, call a C function that's defined in a separate C file. Let's create the project by running cargo new c_from_rust. In the project directory, we'll also add our C source file, mystrlen.c, which contains the following code:

// c_from_rust/mystrlen.c

unsigned int mystrlen(const char *str) {
    unsigned int c;
    /* Advance the pointer until the terminating NUL byte, counting as we go. */
    for (c = 0; *str != '\0'; c++, str++);
    return c;
}

It contains a simple function, mystrlen, which returns the length of a string passed to it. We want to invoke mystrlen from Rust. To do that, we'll need to compile this C source into a static library. There's one more example in the upcoming section, where we cover linking dynamically to a shared library. We'll use the cc crate as a build dependency in our Cargo.toml file:

# c_from_rust/Cargo.toml

[build-dependencies]
cc = "1.0"

The cc crate does all the heavy lifting of compiling and linking our C source file with our binary with correct linker flags. To specify our build commands, we need to put a build.rs file at the crate root, which has the following contents:

// c_from_rust/build.rs

fn main() {
    cc::Build::new()
        .file("mystrlen.c")
        .static_flag(true)
        .compile("mystrlen");
}

We create a new Build instance, pass it the C source filename, set the static flag to true, and finally call the compile method with the name we want for our static library. Cargo compiles and runs build.rs before it compiles the rest of the crate. When the build script runs, the cc crate automatically prepends the conventional lib prefix used by C libraries, so our compiled static library gets generated at target/debug/build/c_from_rust-5c739ceca32833c2/out/libmystrlen.a.

Now, we also need to tell Rust about the existence of our mystrlen function. We do this by using extern blocks, where we can specify items that come from other languages. Our main.rs file is as follows:

// c_from_rust/src/main.rs

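use std::os::raw::{c_char, c_uint};
use std::ffi::CString;

// Declaration of the function implemented in mystrlen.c.
// On the Rust 2024 edition, this block must be written as `unsafe extern "C"`.
extern "C" {
    fn mystrlen(str: *const c_char) -> c_uint;
}

fn main() {
    let c_string = CString::new("C From Rust").expect("failed");
    // Calling a foreign function is always unsafe from Rust's point of view.
    let count = unsafe {
        mystrlen(c_string.as_ptr())
    };
    println!("c_string's length is {}", count);
}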

We have a couple of imports from the std::os::raw module, which contains types that are compatible with primitive C types and have names close to their C counterparts. Each name is the C type name with a c_ prefix, and for numeric types a u after that prefix marks the type as unsigned. For instance, the unsigned integer is defined as c_uint. In our extern declaration of mystrlen, we take a *const c_char as input, which is the Rust counterpart of a C character pointer (const char *), and return a c_uint as output, which maps to unsigned int in C. We also import the CString type from the std::ffi module, as we need to pass a C-compatible string to our mystrlen function. The std::ffi module contains common utilities and types that make it easy to perform cross-language interactions.
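To make the type mapping a little more concrete, here is a minimal sketch that inspects these aliases on the current target. The helper names c_type_sizes and widen are hypothetical and exist only for illustration; the exact sizes depend on the platform, although c_char is one byte and c_int and c_uint share a size.

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_int, c_uint};

/// Sizes in bytes of `c_char`, `c_int`, and `c_uint` on the current target.
/// (Hypothetical helper, not part of the chapter's project.)
fn c_type_sizes() -> (usize, usize, usize) {
    (size_of::<c_char>(), size_of::<c_int>(), size_of::<c_uint>())
}

/// `c_uint` is only an alias for a Rust integer type, so the usual
/// lossless conversions work on it.
fn widen(n: c_uint) -> u64 {
    u64::from(n)
}
```

Because these are plain type aliases, a value returned from C as a c_uint can be used like any other Rust integer once it crosses the boundary.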

As you may have noticed, the extern keyword is followed by a string, "C". It specifies that we want the compiler's code generator to conform to the C ABI, so that the call follows exactly the same calling convention as a call made from C. An Application Binary Interface (ABI) is a set of rules and conventions that dictate how types and functions are represented and manipulated at the lower levels. The function-calling convention is one aspect of an ABI specification. The ABI is analogous to what an API means for a library consumer: in the context of functions, an API specifies which functions you can call from the library, while the ABI specifies the lower-level mechanism by which a function is invoked. A calling convention defines things such as whether function parameters are passed in registers or on the stack, whether the caller or the callee cleans up the register/stack state when the function returns, and other details. We could also have omitted the string, as "C" is the default ABI in Rust for items that are declared in an extern block. It selects the calling convention that the target platform's C compiler uses by default, which is cdecl on 32-bit x86. Rust also supports other ABIs, such as fastcall, cdecl, and win64. These are written as the string after the extern keyword and are chosen based on the platform you are targeting.
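The ABI string is not limited to extern blocks. As a small sketch, it can also be attached to a function defined in Rust, and it then becomes part of the function's pointer type. The names add_c and apply below are hypothetical and are not part of the chapter's project.

```rust
/// A function written in Rust but compiled with the platform's C calling
/// convention, so C code could call it through a function pointer.
extern "C" fn add_c(a: i32, b: i32) -> i32 {
    a + b
}

/// The ABI is part of the function pointer type: `extern "C" fn(..)` is a
/// different type from a plain Rust `fn(..)`.
fn apply(f: extern "C" fn(i32, i32) -> i32, a: i32, b: i32) -> i32 {
    f(a, b)
}
```

Calling apply(add_c, 2, 3) goes through the C calling convention even though both sides are Rust, which is the same mechanism a callback handed to a C library would rely on.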

In our main function, we use the CString type from the std::ffi module because strings in C are null terminated, while Rust ones aren't. CString does all the checks needed to give us a C-compatible string: it verifies that there is no 0 byte in the middle of the string and ensures that the final byte is a 0 byte. The ffi module contains two major string types:

  • std::ffi::CStr represents a borrowed C string that's analogous to &str. It can be used to reference a string that has been created in C.
  • std::ffi::CString represents an owned string that is compatible with foreign C functions. It is often used to pass strings from Rust code to foreign C functions.
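The difference between the two types can be sketched with a pair of small helpers. The function names to_c_bytes and from_c_bytes are hypothetical; the std::ffi calls they use are CString::new, CString::into_bytes_with_nul, CStr::from_bytes_with_nul, and CStr::to_str.

```rust
use std::ffi::{CStr, CString};

/// Builds an owned C string and returns its bytes, including the trailing
/// 0 byte. Returns None if the input contains an interior 0 byte.
fn to_c_bytes(s: &str) -> Option<Vec<u8>> {
    CString::new(s).ok().map(|c| c.into_bytes_with_nul())
}

/// Borrows a 0-terminated byte slice as a CStr and copies it into a String.
/// Returns None if the terminator is missing or the bytes are not UTF-8.
fn from_c_bytes(bytes: &[u8]) -> Option<String> {
    let c_str = CStr::from_bytes_with_nul(bytes).ok()?;
    c_str.to_str().ok().map(str::to_owned)
}
```

Here to_c_bytes("hi") yields the three bytes h, i, and 0, while to_c_bytes("h\0i") yields None because of the interior 0 byte. In the other direction, from_c_bytes(b"hi\0") succeeds and from_c_bytes(b"hi") fails because the terminator is missing.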

Since we want to pass a string from the Rust side to the function we just defined, we used the CString type here. Following that, we call mystrlen in an unsafe block, passing in the c_string as a pointer. We then print the string length to standard output.

Now, all we need to do is run cargo run. After the build messages, the program prints the following line:

c_string's length is 11

The cc crate automatically figures out the correct C compiler to call. In our case, on Ubuntu, it automatically invokes gcc to compile our C library. Now, there are a couple of improvements to be made here. First, it is awkward that we have to use an unsafe block to call the function when we know that this particular call is sound: our C implementation is correct, at least for this small function, and we always pass it a valid null-terminated string. Second, the program panics if CString creation fails. To solve both problems, we can create a safe wrapper function. In its simplest form, this just means creating a function that calls the external function inside an unsafe block:

fn safe_mystrlen(str: &str) -> Option<u32> {
    let c_string = match CString::new(str) {
        Ok(c) => c,
        Err(_) => return None
    };

    unsafe {
        Some(mystrlen(c_string.as_ptr()))
    }
}

Our safe_mystrlen function now returns an Option. It returns None if CString creation fails; otherwise, it calls mystrlen inside an unsafe block and returns the result wrapped in Some. Calling safe_mystrlen feels exactly like calling any other Rust function. Where possible, it's recommended to write safe wrappers around external functions, taking care that every exceptional case that can arise inside the unsafe block is handled properly, so that library consumers don't need to use unsafe in their own code.
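To see how the wrapper behaves without going through the C build step, the sketch below swaps the foreign function for a Rust stand-in with the same signature. This stand-in is an assumption made purely for illustration; in the real project, mystrlen comes from the extern block and the compiled mystrlen.c.

```rust
use std::ffi::CString;
use std::os::raw::{c_char, c_uint};

/// Stand-in for the C `mystrlen`, written in Rust so that this sketch runs
/// on its own. Safety: `ptr` must point to a 0-terminated string.
unsafe fn mystrlen(ptr: *const c_char) -> c_uint {
    let mut len: c_uint = 0;
    // Walk forward until the terminating 0 byte, as the C loop does.
    while unsafe { *ptr.add(len as usize) } != 0 {
        len += 1;
    }
    len
}

/// Safe wrapper: None for input with an interior 0 byte, Some(length) otherwise.
fn safe_mystrlen(s: &str) -> Option<c_uint> {
    let c_string = CString::new(s).ok()?;
    // Sound because `c_string` is 0-terminated and outlives the call.
    Some(unsafe { mystrlen(c_string.as_ptr()) })
}
```

With this in place, safe_mystrlen("C From Rust") evaluates to Some(11), and safe_mystrlen("bad\0input") evaluates to None instead of panicking, so callers handle the failure case with ordinary Option handling.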
