Include Assembly Source Files In Rust Project (and Build with Cargo)

Published by Philipp Schuster on

Update 2021-12-21: I added documentation references. As of Rust 1.59 (a nightly version at the time of this update), the macros global_asm! and asm! are finally stable!

Recently I stumbled upon a Rust project where *.S and *.rs files sat side by side in the same Cargo source directory, and I was like: “What the heck, I always wanted this!” When I wrote this article, there was no documentation about this, or it was well hidden. If you’ve programmed in C/C++ before, you have probably seen that gcc compiles *.c and *.S files to *.o files and that the linker combines all of them into a library or an executable. These *.S files (the assembly) are written in GAS (GNU Assembler) syntax.

When writing firmware or operating systems, it’s important to be able to define some functions in assembly language. For example, you can construct an aligned multiboot2 header in assembly, or initialize the stack and several other registers before you jump into code written in a higher-level language like C or Rust. For each symbol, you usually also want to be able to choose the section, like .data, .text or .bss.

Rust provides the macros asm! and global_asm!. Both were stabilized in Rust 1.59 and no longer require a feature flag; at the time of the update above, that version was still on the nightly channel. Definitions of symbols/functions written in assembly must live in global_asm!, because inline assembly (asm!) is expanded inside a function body and is limited to assembly instructions. On x86, Rust uses a GAS-like syntax with the Intel flavor by default for all assembly [GitHub PR, early language reference preview]. It forwards all of the assembly source code to LLVM. Therefore, the general format stays the same across architectures, but of course the available instructions and registers differ from one architecture to the next.
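To make the split between the two macros concrete, here is a minimal sketch. It assumes an x86_64 or aarch64 target with ELF-style symbol names (for example Linux) and Rust 1.82 or newer for the `unsafe extern` spelling. The names example_add, add_inline and add_global are made up for illustration and do not come from any library. asm! places an instruction inside an existing Rust function, whereas global_asm! defines a whole symbol that Rust then declares and calls.

```rust
use core::arch::{asm, global_asm};

#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
compile_error!("this sketch only covers x86_64 and aarch64");

// Module-level assembly: defines the symbol `example_add` (hypothetical name).
// Assumes the System V calling convention on x86_64: arguments arrive in
// rdi and rsi, the result is returned in rax.
#[cfg(target_arch = "x86_64")]
global_asm!(
    ".global example_add",
    "example_add:",
    "    lea rax, [rdi + rsi]",
    "    ret",
);

// Same symbol for aarch64 (AAPCS64): arguments in x0 and x1, result in x0.
#[cfg(target_arch = "aarch64")]
global_asm!(
    ".global example_add",
    "example_add:",
    "    add x0, x0, x1",
    "    ret",
);

// Declaration of the assembly symbol so that Rust code can call it.
// On toolchains older than Rust 1.82, write a plain `extern "C"` block.
unsafe extern "C" {
    fn example_add(a: u64, b: u64) -> u64;
}

/// Adds two numbers by calling the function defined in `global_asm!`.
pub fn add_global(a: u64, b: u64) -> u64 {
    unsafe { example_add(a, b) }
}

/// Adds two numbers with a single inline instruction. The compiler picks
/// the registers; no symbol is defined here.
pub fn add_inline(a: u64, b: u64) -> u64 {
    let mut sum = a;
    unsafe {
        #[cfg(target_arch = "x86_64")]
        asm!(
            "add {sum}, {b}",
            sum = inout(reg) sum,
            b = in(reg) b,
            options(pure, nomem, nostack),
        );
        #[cfg(target_arch = "aarch64")]
        asm!(
            "add {sum}, {sum}, {b}",
            sum = inout(reg) sum,
            b = in(reg) b,
            options(pure, nomem, nostack),
        );
    }
    sum
}
```

With these definitions, add_inline(40, 2) and add_global(40, 2) should both evaluate to 42. On x86_64 Windows the C ABI passes arguments in different registers, so the global_asm! part would need adjusting there.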

Quote from the language reference:

Currently, all supported targets follow the assembly code syntax used by LLVM’s internal assembler which usually corresponds to that of the GNU assembler (GAS). On x86, the .intel_syntax noprefix mode of GAS is used by default. On ARM, the .syntax unified mode is used. These targets impose an additional restriction on the assembly code: any assembler state (e.g. the current section which can be changed with .section) must be restored to its original value at the end of the asm string. Assembly code that does not conform to the GAS syntax will result in assembler-specific behavior.

https://github.com/rust-lang/reference/blob/cf3a28145e06a3294494b5ac2ac4beef9f2e52e0/src/inline-assembly.md
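The restriction on assembler state from the quote can be sketched as follows. This assumes an ELF target such as Linux (section names and symbol prefixes differ on macOS and Windows) and Rust 1.82 or newer. EXAMPLE_MAGIC and the two helper functions are hypothetical names, and 0x1337 is an arbitrary placeholder value. .pushsection switches to .rodata and .popsection switches back, so the section that was active before the asm string is restored at its end.

```rust
use core::arch::global_asm;

global_asm!(
    // Remember the current section and switch to read-only data.
    ".pushsection .rodata",
    // Align the following data to 8 bytes.
    ".balign 8",
    // Export the symbol so that the linker can resolve it from Rust code.
    ".global EXAMPLE_MAGIC",
    "EXAMPLE_MAGIC:",
    "    .quad 0x1337",
    // Restore the section that was active before this asm string.
    ".popsection",
);

// Declaration of the data symbol defined above.
// On toolchains older than Rust 1.82, write a plain `extern "C"` block.
unsafe extern "C" {
    static EXAMPLE_MAGIC: u64;
}

/// Reads the 64-bit value that the assembly placed in .rodata.
pub fn example_magic() -> u64 {
    unsafe { EXAMPLE_MAGIC }
}

/// Returns the address of the symbol, e.g. to check its alignment.
pub fn example_magic_addr() -> usize {
    unsafe { &EXAMPLE_MAGIC as *const u64 as usize }
}
```

Because the block contains only directives and data, it does not depend on a particular instruction set. The same pattern should carry over to structures such as a boot header that has to live in its own section with a given alignment.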

Below is an example for x86_64.

// `global_asm!` and `asm!` are stable since Rust 1.59. On older nightly
// toolchains you need `#![feature(global_asm)]` instead, and the import
// of `core::arch::global_asm` is not required.

use core::arch::global_asm;

// x86_64 only. `.global` makes the symbol visible to the linker.
global_asm!("
    .global foo
    foo:
      mov rax, 1337
      ret
");
// Alternative: keep the same assembly in a file. The path is relative to
// the Rust file that contains the macro call (here: the src directory),
// and Cargo rebuilds if "foo.S" changes. Define `foo` only once.
// global_asm!(include_str!("foo.S"));

extern "C" {
    fn foo() -> u64;
}

fn main() {
    unsafe {
        println!("foo : {}", foo());
    }
}

This way, $ cargo build also takes care of the assembly files. Furthermore, Cargo notices when one of these files changes and recompiles the necessary parts. I didn’t trace the internal process, but I guess it creates a separate object file and links it together with the object file from the Rust code.

I made a project on GitHub that uses this feature to include global functions written in assembly. The official documentation in the language reference will be available on https://doc.rust-lang.org/stable/reference/, as soon as this PR gets merged.


Philipp Schuster

Hi, I'm Philipp and interested in Computer Science. I especially like low level development, making ugly things nice, and de-mystify "low level magic".
