As a systems language, Rust is able to interact with other programming languages with zero extra overhead. This point of interaction is referred to as the Foreign Function Interface, where the term "foreign function" refers to functions that the compiler has no way of inspecting (e.g. because it's from a DLL written in another language).

At the moment, Rust only supports FFI through the C ABI, as that is the most common and stable ABI in existence, and it allows Rust to interact with almost all libraries. In the future, Rust may support other ABIs such as the Swift ABI, and it may even get its own stable ABI.

Declaring the existence of foreign functions

Before code can call external functions, declarations need to be created so the compiler knows about the functions and their signatures. In Rust this is done using extern blocks.

For example, to declare an add() function which accepts two integers and returns an integer you would write a declaration like this:

extern {
    fn add(a: u32, b: u32) -> u32;
}

The name of the imported item doesn't have to be the same as the symbol exported from the DLL.

extern {
    #[link_name = "AddTwoNumbers"]
    fn add(a: u32, b: u32) -> u32;
}

While the name of the function is add() on the Rust side, the DLL should export a function called AddTwoNumbers.[1]
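As a concrete illustration, the sketch below binds the C standard library's strlen under a different Rust-side name. The names c_string_length and length_of are made up for this example, and it assumes that the platform's C library exports strlen and is already linked in, as it is for ordinary Rust programs on common platforms. Note that in the Rust 2024 edition the block would have to be written as unsafe extern "C".

```rust
use std::ffi::CString;
use std::os::raw::c_char;

extern "C" {
    // The exported symbol is `strlen`; on the Rust side it is called
    // `c_string_length` (a name chosen for this example).
    #[link_name = "strlen"]
    fn c_string_length(s: *const c_char) -> usize;
}

/// Hypothetical safe helper: measures a string with the C function.
pub fn length_of(text: &str) -> usize {
    // CString appends the NUL terminator that strlen relies on
    // and rejects strings with interior NUL bytes.
    let c_text = CString::new(text).expect("text must not contain NUL bytes");
    // Sound because `c_text` is a valid NUL-terminated string
    // that stays alive for the whole call.
    unsafe { c_string_length(c_text.as_ptr()) }
}
```

Calling length_of("hello") goes through the symbol strlen even though that name never appears as a Rust identifier.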

Extern functions in Rust also support variadic arguments (an unspecified number of arguments of any type), as used in C functions such as printf:

use std::os::raw::{c_char, c_int};

extern {
    fn printf(format: *const c_char, ...) -> c_int;
}
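To show what calling a variadic function looks like, here is a minimal sketch that uses snprintf rather than printf, so the result ends up in a buffer that can be inspected. The helper format_sum is hypothetical, and the sketch assumes a C library that exports snprintf as a regular symbol (as glibc does).

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int};

extern "C" {
    // int snprintf(char *buf, size_t size, const char *format, ...);
    fn snprintf(buf: *mut c_char, size: usize, format: *const c_char, ...) -> c_int;
}

/// Hypothetical helper: formats "<a> + <b> = <sum>" using C's snprintf.
pub fn format_sum(a: c_int, b: c_int) -> String {
    let format = CString::new("%d + %d = %d").unwrap();
    let mut buf = [0 as c_char; 64];
    // Variadic arguments are not type-checked by the compiler: each one
    // must match its conversion specifier (`%d` expects a C int).
    let written = unsafe {
        snprintf(
            buf.as_mut_ptr(),
            buf.len(),
            format.as_ptr(),
            a,
            b,
            a.wrapping_add(b),
        )
    };
    assert!(written >= 0 && (written as usize) < buf.len(), "output was truncated");
    // snprintf NUL-terminates its output when the size is non-zero,
    // so the buffer now holds a valid C string.
    let text = unsafe { CStr::from_ptr(buf.as_ptr()) };
    text.to_string_lossy().into_owned()
}
```

The compiler cannot check that the arguments agree with the format string, which is one more reason such calls are unsafe.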

Declarations like these can be automatically generated from C headers using bindgen.

You can instruct the Rust compiler to link against certain static or dynamic libraries using the link attribute[2]:

// Link this extern block to the dynamic library mylib.
// For example, in Linux this might link against /usr/lib/libmylib.so.
#[link(name = "mylib")] extern {}

// The same as the above example, since "dylib" is the default "kind".
#[link(name = "mylib", kind = "dylib")] extern {}

// Link this extern block to the static library mylib.
// For example, in Linux this might link against /usr/lib/libmylib.a.
#[link(name = "mylib", kind="static")] extern {}

// Link this extern block to the macOS framework mylib.
// For example, in macOS this might link against /Library/Frameworks/mylib.framework.
// Beware that frameworks are only supported on macOS.
#[link(name = "mylib", kind="framework")] extern {}
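The extern blocks above are empty, but in practice the link attribute sits on a block that also declares the functions. A minimal sketch, assuming a Unix-like target where the C math library is available as libm; the wrapper name c_cos is made up for this example:

```rust
use std::os::raw::c_double;

// Assumption: the C math library can be linked as `m` (libm) on this target.
#[link(name = "m")]
extern "C" {
    // double cos(double x); from <math.h>
    fn cos(x: c_double) -> c_double;
}

/// Hypothetical safe wrapper: `cos` accepts any double, so there is
/// no precondition for the caller to uphold.
pub fn c_cos(x: f64) -> f64 {
    unsafe { cos(x) }
}
```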

In addition to the kinds demonstrated above, there is another kind called raw-dylib. It was designed first and foremost so that Rust code can link against Windows DLLs without using import libraries. raw-dylib was proposed in RFC 2627: raw-dylib-kind.

Calling foreign functions

By definition, a foreign function contains code that the Rust compiler is unable to read or analyse, making it impossible for the borrow checker to ensure that it follows Rust's rules around ownership, lifetimes, and borrowing. That means any call to a foreign function must be made from an unsafe block.

For example, to call the previous add() function:

fn main() {
    unsafe {
        let four = add(2, 2);
        assert_eq!(four, 4);
    }
}

This requirement to use unsafe means FFI and unsafe are closely related, with a lot of conceptual overlap.
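Because the compiler cannot check a foreign function's preconditions, a common approach is to keep the unsafe call inside a small function that checks them itself. The sketch below does this for C's abs, whose result is undefined for the most negative int. The wrapper name checked_c_abs is hypothetical, and the sketch assumes the C standard library is linked in.

```rust
use std::os::raw::c_int;

extern "C" {
    // int abs(int j); from <stdlib.h>
    fn abs(j: c_int) -> c_int;
}

/// Hypothetical safe wrapper around C's `abs`.
/// C leaves `abs(INT_MIN)` undefined, so that input is rejected up front.
pub fn checked_c_abs(value: c_int) -> Option<c_int> {
    if value == c_int::MIN {
        return None;
    }
    // Sound: every remaining input satisfies the C function's precondition.
    Some(unsafe { abs(value) })
}
```

Callers of checked_c_abs never write unsafe themselves; the one unsafe block is justified by the check directly above it.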

Linking with foreign code

When someone talks about "linking with a foreign library," they usually mean using a C library. Since C library standards are different from Rust, and they can be different depending on the operating system, a bit of metadata needs to be specified for the Rust toolchain to work with it. This winds up with the library embedded in the final executable (a "statically linked" C library) or with some metadata in the executable that tells the operating system to pull in the library when you run it (a "dynamically linked" C library).

The most common method of linking foreign code in Rust is to write a Cargo build script (build.rs). A build script is compiled to an executable and run by cargo, and the special instructions that it prints are read and passed on to the compiler.

// build.rs
fn main() {
    println!("cargo:rustc-link-lib=mylib");
    println!("cargo:rustc-link-search=/opt/mylib/lib");
}
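The hard-coded search path in this script could be made configurable. Here is a minimal sketch, assuming a hypothetical MYLIB_LIB_DIR environment variable (not something cargo or mylib defines). The directive-building logic is kept in its own function so it can be checked separately, and build.rs's main would simply call emit_link_directives.

```rust
use std::env;

/// Builds the cargo directives for linking `mylib`,
/// optionally adding a library search path first.
fn link_directives(lib_dir: Option<&str>) -> Vec<String> {
    let mut lines = Vec::new();
    if let Some(dir) = lib_dir {
        lines.push(format!("cargo:rustc-link-search=native={}", dir));
    }
    lines.push("cargo:rustc-link-lib=mylib".to_string());
    lines
}

/// Intended to be called from `main` in build.rs.
pub fn emit_link_directives() {
    // MYLIB_LIB_DIR is a hypothetical variable name chosen for this sketch.
    // Ask cargo to re-run the script whenever it changes.
    println!("cargo:rerun-if-env-changed=MYLIB_LIB_DIR");
    let dir = env::var("MYLIB_LIB_DIR").ok();
    for line in link_directives(dir.as_deref()) {
        println!("{}", line);
    }
}
```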

Since different deployments and versions might put the libraries in different places (meaning rustc-link-search will be different), build scripts usually locate them through a probe tool like pkg-config (used on Linux and macOS), vcpkg (used on Windows), or both. You can get even more sophisticated with target env tweaks, as libz-sys does, but this is basically how it works.

fn main() {
    #[cfg(target_env = "msvc")] {
        vcpkg::Config::new()
            .emit_includes(true)
            .probe("mylib")
            .unwrap();
    }
    #[cfg(not(target_env = "msvc"))] {
        pkg_config::Config::new()
            .cargo_metadata(true)
            .print_system_libs(false)
            .probe("mylib")
            .unwrap();
    }
}

To access these libraries from the build script, add them to Cargo.toml as build-dependencies.

[build-dependencies]
pkg-config = "0.3.9"

[target.'cfg(target_env = "msvc")'.build-dependencies]
vcpkg = "0.2"

If the library is open source (or if you won't be distributing your wrapper crate via crates.io), then you could also include code to compile the C library directly, making this a complete Cargo packaging of the library, rather than just a shim that probes for a copy of the library that's already installed. Common options for building it are the cmake and cc crates.

Conventionally, a Rust crate that wraps a C library to link with it will only contain the code to link to it and the bare minimum unsafe extern FFI definitions. These crates are conventionally called sys crates. This way, multiple crates that provide safe APIs on top of the same library can share their probe/compile code, and crates that depend on both of them won't wind up linking to the same C library twice (which usually doesn't work very well).

Common errors when linking with FFI libraries

skipping incompatible /home/me/project/target/my-target/release/build/crate/out/lib/libmylib.so when searching for -lmylib

This probably means the C compiler isn't building for the same target that the Rust compiler is building for. Usually, a good sys crate will ensure that it builds libraries for the correct target architecture, but you may need to manually specify the right C compiler to use with environment variables, as documented in the cc crate:

$ export CC=mipsel-linux-gnu-gcc
$ cargo build --target mipsel-unknown-linux-musl

Exporting Rust functions to other languages

A Rust library can be exposed to other languages by compiling it as a cdylib or staticlib, which emulate the behaviour of C libraries. The Rust compiler has a --crate-type command line option, and cargo has a crate-type manifest option, to do this.

In the library itself:

  • The extern "C" qualifier marks a function as using the C calling convention, which determines how function parameters are passed in registers and on the call stack. This is needed so that your Rust function can be called through a C function pointer (the C counterpart of Rust's fn pointer types).
  • The #[no_mangle] attribute stops Rust from mangling the name of the function, that is, from encoding the crate, the module path, and a hash into it. This is needed so that your Rust function can be called by name from a C project.
#[no_mangle]
extern "C" fn runrust_hello_world() {
    println!("Hi!");
}

To invoke this function from a C project, your library's users would want a header file. Header files can be automatically generated by cbindgen, or written by hand.

#ifndef RUNRUST_HELLO_WORLD_H
#define RUNRUST_HELLO_WORLD_H

void runrust_hello_world(void);

#endif /* ifndef RUNRUST_HELLO_WORLD_H */
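The function-pointer use of extern "C" can be tried out without a separate C project by handing a Rust function to a C library routine that takes a callback. The sketch below passes a comparator to the C standard library's qsort. The names compare_i32 and c_sort are made up for this example, and it assumes the C standard library is linked in.

```rust
use std::os::raw::{c_int, c_void};

extern "C" {
    // void qsort(void *base, size_t nmemb, size_t size,
    //            int (*compar)(const void *, const void *));
    fn qsort(
        base: *mut c_void,
        nmemb: usize,
        size: usize,
        compar: extern "C" fn(*const c_void, *const c_void) -> c_int,
    );
}

// A Rust function with the C calling convention, usable as a C callback.
extern "C" fn compare_i32(a: *const c_void, b: *const c_void) -> c_int {
    // qsort passes pointers to two elements of the array being sorted.
    let (a, b) = unsafe { (*(a as *const i32), *(b as *const i32)) };
    // Ordering converts to -1, 0, or 1, which is what qsort expects.
    a.cmp(&b) as c_int
}

/// Hypothetical helper: sorts a slice in place using C's qsort.
pub fn c_sort(values: &mut [i32]) {
    if values.is_empty() {
        return; // nothing to sort, so skip the foreign call entirely
    }
    unsafe {
        qsort(
            values.as_mut_ptr() as *mut c_void,
            values.len(),
            std::mem::size_of::<i32>(),
            compare_i32,
        );
    }
}
```

No #[no_mangle] is needed here, because the C side receives the function's address rather than looking it up by name.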

Diagnosing problems with nm and readelf (on Linux)

Once you've created a C library using rustc and/or cargo, you can look at it using some of C's debugging tools, like readelf and nm. For example, nm outputs something like this on a Linux system:

$ rustc --crate-type=cdylib test.rs
$ nm -g libtest.so
                 U abort@@GLIBC_2.2.5
                 U bcmp@@GLIBC_2.2.5
                 U calloc@@GLIBC_2.2.5
                 U close@@GLIBC_2.2.5
                 w __cxa_finalize@@GLIBC_2.2.5
                 w __cxa_thread_atexit_impl@@GLIBC_2.18
                 U dl_iterate_phdr@@GLIBC_2.2.5
                 U __errno_location@@GLIBC_2.2.5
                 U free@@GLIBC_2.2.5
                 U __fxstat64@@GLIBC_2.2.5
                 U getcwd@@GLIBC_2.2.5
                 U getenv@@GLIBC_2.2.5
                 w __gmon_start__
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
                 U malloc@@GLIBC_2.2.5
                 U memchr@@GLIBC_2.2.5
                 U memcpy@@GLIBC_2.14
                 U memmove@@GLIBC_2.2.5
                 U memrchr@@GLIBC_2.2.5
                 U memset@@GLIBC_2.2.5
                 U mmap@@GLIBC_2.2.5
                 U munmap@@GLIBC_2.2.5
                 U open64@@GLIBC_2.2.5
                 U posix_memalign@@GLIBC_2.2.5
                 U pthread_getspecific@@GLIBC_2.2.5
                 U pthread_key_create@@GLIBC_2.2.5
                 U pthread_key_delete@@GLIBC_2.2.5
                 U pthread_mutexattr_destroy@@GLIBC_2.2.5
                 U pthread_mutexattr_init@@GLIBC_2.2.5
                 U pthread_mutexattr_settype@@GLIBC_2.2.5
                 U pthread_mutex_init@@GLIBC_2.2.5
                 U pthread_mutex_lock@@GLIBC_2.2.5
                 U pthread_mutex_trylock@@GLIBC_2.2.5
                 U pthread_mutex_unlock@@GLIBC_2.2.5
                 U pthread_rwlock_rdlock@@GLIBC_2.2.5
                 U pthread_rwlock_unlock@@GLIBC_2.2.5
                 U pthread_setspecific@@GLIBC_2.2.5
                 U readlink@@GLIBC_2.2.5
                 U realloc@@GLIBC_2.2.5
0000000000005170 T runrust_hello_world
00000000000237a0 T rust_eh_personality
                 U strlen@@GLIBC_2.2.5
                 U syscall@@GLIBC_2.2.5
                 U __tls_get_addr@@GLIBC_2.3
                 U _Unwind_Backtrace@@GCC_3.3
                 U _Unwind_GetDataRelBase@@GCC_3.0
                 U _Unwind_GetIP@@GCC_3.0
                 U _Unwind_GetIPInfo@@GCC_4.2.0
                 U _Unwind_GetLanguageSpecificData@@GCC_3.0
                 U _Unwind_GetRegionStart@@GCC_3.0
                 U _Unwind_GetTextRelBase@@GCC_3.0
                 U _Unwind_RaiseException@@GCC_3.0
                 U _Unwind_Resume@@GCC_3.0
                 U _Unwind_SetGR@@GCC_3.0
                 U _Unwind_SetIP@@GCC_3.0
                 U write@@GLIBC_2.2.5
                 U writev@@GLIBC_2.2.5
                 U __xpg_strerror_r@@GLIBC_2.3.4

The symbols marked U are undefined: libtest.so refers to them but does not provide them itself, so the dynamic linker has to find them in another library (here, the system's C library and GCC's unwinding library) when libtest.so is loaded. The ones marked w are weak undefined symbols; they are looked up in the same way, but it is not an error if they cannot be found. The only two symbols that are actually part of this library are the ones marked T: runrust_hello_world, which is the function written above, and rust_eh_personality, which is used during unwinding if your library invokes panic!.

If #[no_mangle] hadn't been added to the function definition, we would run into a gotcha and get this result:

$ rustc --crate-type=cdylib test.rs
warning: function is never used: `runrust_hello_world`
 --> test.rs:2:15
  |
2 | extern "C" fn runrust_hello_world() {
  |               ^^^^^^^^^^^^^^^^^^^
  |
  = note: `#[warn(dead_code)]` on by default

warning: 1 warning emitted

$ nm libtest.so | grep runrust
$

The function was entirely stripped, because nothing in Rust called it, so it was assumed to be unused. To fix this, the library can provide a function pointer, like this, and it'll show us the "mangled" function name.

extern "C" fn runrust_hello_world() {
    println!("Hi!");
}
#[no_mangle]
extern "C" fn runrust_export() -> extern "C" fn() {
    runrust_hello_world
}

$ rustc --crate-type=cdylib test.rs
$ nm libtest.so | grep runrust
00000000000051b0 T runrust_export
0000000000005170 t _ZN4test19runrust_hello_world17h56d7455090c441dfE

Its calling convention is C-compatible, so it can be called through the function pointer, but its name is a mixture of the name of the crate ("test"), the name of the function ("runrust_hello_world"), and a fingerprint. Also, the little "t" means that the symbol is "local", so normal C software won't be able to call the function by name even if it does know it, because the dynamic linker won't resolve that name.

Creating safe abstractions

Main article: Unsafe § Building safe abstractions

Functions that Rust imports through FFI are unsafe to call, and you will usually want to wrap them in a safe API. How is that done?

Wrapping unsafe functions and interfaces

Those can be wrapped using the newtype pattern, defining a type like this one in nix:

/// Clock identifier
///
/// Newtype pattern around `clockid_t` (which is just an alias). It prevents bugs caused by
/// accidentally passing the wrong value.
#[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash)]
pub struct ClockId(clockid_t);

A plain clockid_t can be created using numeric constants, but a ClockId struct can't be accidentally created that easily. The library should also take care of invoking the functions themselves in a safe way:

// Short comments are added for this wiki.

/// Get the resolution of the specified clock, (see
/// [clock_getres(2)](https://pubs.opengroup.org/onlinepubs/7908799/xsh/clock_getres.html)).
pub fn clock_getres(clock_id: ClockId) -> Result<TimeSpec> {
    // The C library being wrapped uses a common idiom called an "out pointer",
    // where instead of returning a data structure (in this case, `timespec`),
    // it accepts a pointer and writes the `timespec` instance to it.
    // This way, the return value can be used to report success or failure.
    // If the return value is zero, then the timespec should have been
    // written to. Otherwise, it is -1, `errno` holds the cause, and the
    // timespec should be discarded.
    let mut c_time: MaybeUninit<libc::timespec> = MaybeUninit::uninit();
    let ret = unsafe { libc::clock_getres(clock_id.as_raw(), c_time.as_mut_ptr()) };
    // `Errno::result` is a function inside Nix that makes converting a C-style status into a Result easier.
    // If `ret` is zero, this line does nothing. Otherwise, it returns from the function.
    Errno::result(ret)?;
    // At this point, we know that the call succeeded. Unless there's a bug in libc, it should
    // be safe to assume that c_time was written to by libc::clock_getres.
    let res = unsafe { c_time.assume_init() };
    // nix::TimeSpec is to libc::timespec as nix::ClockId is to libc::clockid_t.
    Ok(TimeSpec::from(res))
}
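The same shape can be tried without depending on nix or libc. In the sketch below everything is hypothetical: fake_getres is a stand-in written in Rust for a C function that uses an out pointer, and Resolution and ClockKind play the roles of timespec and ClockId. Only the wrapping technique is meant to carry over.

```rust
use std::mem::MaybeUninit;
use std::os::raw::c_int;

/// Stand-in for a C struct that the library fills in.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Resolution {
    pub seconds: i64,
    pub nanoseconds: i64,
}

/// Newtype so that arbitrary integers are not mistaken for clock ids.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ClockKind(c_int);

impl ClockKind {
    pub const REALTIME: ClockKind = ClockKind(0);

    /// Safe, because the pretend C function reports an error for unknown ids.
    pub const fn from_raw(raw: c_int) -> Self {
        ClockKind(raw)
    }
}

/// Pretend C function: returns 0 and writes to `out` on success,
/// returns -1 and leaves `out` untouched otherwise.
unsafe extern "C" fn fake_getres(clock: c_int, out: *mut Resolution) -> c_int {
    if clock != 0 || out.is_null() {
        return -1;
    }
    unsafe { out.write(Resolution { seconds: 0, nanoseconds: 1 }) };
    0
}

/// Safe wrapper: callers never see the uninitialised memory.
pub fn resolution(clock: ClockKind) -> Result<Resolution, c_int> {
    let mut out = MaybeUninit::<Resolution>::uninit();
    let ret = unsafe { fake_getres(clock.0, out.as_mut_ptr()) };
    if ret != 0 {
        return Err(ret);
    }
    // Only reached on success, when `out` has been written to.
    Ok(unsafe { out.assume_init() })
}
```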

Preventing bad instances from being constructed

nix provides a ClockId::from_raw function that is safe, because according to the POSIX specification, using an invalid clock ID produces an error (which is safe) instead of invoking Undefined Behaviour (which is not).

Other libraries cannot do this. For example, the libtls library has a Tls::from_sys function that is not safe. It provides a detailed explanation of why this is, especially the ownership transfer:

// Short comments are added for this wiki.

// The libtls_sys crate is the raw libtls bindings.
// This code is from the libtls crate, which provides a safe abstraction.
#[derive(Debug)]
pub struct Tls(*mut libtls_sys::tls, RawFd);

/// Wrap a raw C `tls` object.
///
/// # Safety
///
/// This function assumes that the raw pointer is valid, and takes
/// ownership of the libtls object.
/// Do not call `tls_free` yourself, since the `drop` destructor will
/// take care of it.
///
/// # Panics
///
/// Panics if `tls` is a null pointer.
pub unsafe fn from_sys(tls: *mut libtls_sys::tls) -> Tls {
    // This would be a reasonable unsafe abstraction, even without this check.
    // It exists because libtls_sys functions often return null pointers on error.
    if tls.is_null() {
        panic!("{}", io::Error::last_os_error())
    }
    Tls(tls, -1)
}

The libtls crate also provides safe methods to construct Tls objects, such as Tls::client:

// Short comments are added for this wiki.

impl Tls {
    // Like the Errno::result function in Nix, this private "new" function in libtls is a convenient tool for
    // converting C-style errors into Rust-style errors.
    fn new(f: unsafe extern "C" fn() -> *mut libtls_sys::tls) -> io::Result<Self> {
        // `new` accepts a function pointer as a parameter, and invokes it, doing work to the provided function's result.
        // Invoking arbitrary C-style function pointers is not safe, and so this function isn't safe to call with arbitrary parameters,
        // but it is private to this module, so the libtls crate's users can't exploit it.
        let tls = unsafe { f() };
        // Unlike libc::clock_getres, libtls_sys constructors do not use out pointers. Instead, they return a null
        // pointer when there is an error, and otherwise return a pointer to an object allocated on the heap when
        // they succeed. The cause of the error is stored in a thread-local variable, which is
        // extracted and converted into a Result here.
        if tls.is_null() {
            Err(io::Error::last_os_error())
        } else {
            Ok(Tls(tls, -1))
        }
    }

    pub fn client() -> io::Result<Self> {
        Self::new(libtls_sys::tls_client)
    }

    pub fn server() -> io::Result<Self> {
        Self::new(libtls_sys::tls_server)
    }

    // more functions that use Self::new also exist ...
}
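The ownership side of this pattern, where the wrapper frees the C object in its destructor, can be sketched in a self-contained way. Everything here is hypothetical: session_new, session_new_failing, and session_free are written in Rust and merely stand in for a C library's constructor and destructor, and the live-object counter exists only so that the effect of Drop can be observed.

```rust
use std::io;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts live objects so the sketch can show that `Drop` frees them.
static LIVE_OBJECTS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for a C library's object type.
pub struct RawSession {
    _id: u32,
}

// Pretend C constructor: returns a heap pointer that the caller must free.
unsafe extern "C" fn session_new() -> *mut RawSession {
    LIVE_OBJECTS.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(RawSession { _id: 1 }))
}

// Pretend C constructor that always fails by returning null.
unsafe extern "C" fn session_new_failing() -> *mut RawSession {
    std::ptr::null_mut()
}

// Pretend C destructor.
unsafe extern "C" fn session_free(raw: *mut RawSession) {
    if !raw.is_null() {
        drop(unsafe { Box::from_raw(raw) });
        LIVE_OBJECTS.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Safe owner of a `RawSession`; the pointer is non-null by construction.
pub struct Session(*mut RawSession);

impl Session {
    // Private, so users of the type cannot pass arbitrary function pointers.
    fn from_constructor(f: unsafe extern "C" fn() -> *mut RawSession) -> io::Result<Self> {
        let raw = unsafe { f() };
        if raw.is_null() {
            Err(io::Error::new(io::ErrorKind::Other, "constructor returned null"))
        } else {
            Ok(Session(raw))
        }
    }

    pub fn open() -> io::Result<Self> {
        Self::from_constructor(session_new)
    }

    pub fn open_failing() -> io::Result<Self> {
        Self::from_constructor(session_new_failing)
    }
}

impl Drop for Session {
    fn drop(&mut self) {
        // The wrapper owns the object, so it is freed exactly once, here.
        unsafe { session_free(self.0) };
    }
}

pub fn live_objects() -> usize {
    LIVE_OBJECTS.load(Ordering::SeqCst)
}
```

Because only Session ever holds the raw pointer, safe code cannot free it twice or use it after it has been freed.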

See also

References