The Probably Simplest x86 Driver Written in Assembly – Printing to QEMU’s debugcon-Device
This is the educational resource I wish I had had when I started digging into low-level programming and operating systems early in my university studies. Well, dear past me, I’ve got your back!
Do you want to see a minimal way to communicate with (virtual) hardware, all written in pure assembly? Here you go. But first, some background. How do we communicate with hardware? Devices are exposed either through memory-mapped registers or, on x86, additionally through I/O ports. In both cases, the hardware is reached through reads and writes to certain locations. Memory is accessed with ordinary instructions such as mov, whereas I/O ports are reached with the in/out instructions. But let’s dive into some code!
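To make the difference between the two access styles concrete, here is a minimal C sketch (C rather than assembly, so that it compiles on a regular host). The helper names mmio_write8 and port_write8 are placeholders I made up for illustration, and the inline assembly relies on the GCC/Clang extended asm syntax. Note that port_write8 only works with I/O privileges (for example in a kernel); in a normal user-space process the out instruction faults.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Memory-mapped I/O: an ordinary store to an address. "volatile" tells the
 * compiler that the store has side effects and must not be optimized away. */
static inline void mmio_write8(volatile uint8_t *reg, uint8_t value) {
    assert(reg != NULL);
    *reg = value;
}

#if defined(__i386__) || defined(__x86_64__)
/* Port I/O: needs the dedicated "out" instruction (x86 only).
 * "a" places the value in %al; "Nd" uses an 8-bit immediate port number
 * or the %dx register. */
static inline void port_write8(uint16_t port, uint8_t value) {
    __asm__ volatile("outb %0, %1" : : "a"(value), "Nd"(port));
}
#endif
```

On the host, mmio_write8 can be tried against a plain variable that stands in for a device register; port_write8(0xe9, 'A') would be the C counterpart of the driver we write below.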
Introduction
I created a minimal project that demonstrates what is possibly the simplest x86 driver written in assembly: a driver for the debugcon device of QEMU. The device can be configured for x86 guests and is accessible via I/O port 0xe9. It passes data to the outside world byte by byte. This way, we can print, for example, ASCII strings from within the VM to the QEMU Virtual Machine Monitor. QEMU can be configured to forward the data to stdout or to a file.
I created a GitHub repository. I recommend opening it side-by-side with this blog post so that you can look at the code. The most important file is main.S. This file contains the code for a kernel-like binary that can be booted via Multiboot1. This helps us to run it easily in QEMU. All assembly code uses AT&T syntax and directives from GNU Assembler (GAS). This is also compatible with LLVM’s assembler, i.e., you can use the same code in a Rust project.
In four steps that build on each other, I show how we build the kernel and the driver:
- Prepare the basic build setup.
- Set up a stack so that we can use call/ret.
- Write the driver code.
- Prepare the function argument (the string to print) and invoke the driver.
By the way: this project can also serve as a basic playground for small assembly experiments, as you can change the code and run it immediately in QEMU without much setup.
1. Prepare the Basic Build Setup
We need a main.S file with the following content:
    # "ax": section flags. Relevant for the section flags in the object file.
    # Without those flags, the linker doesn't put this section into a LOAD segment as we expect.
    .section .init_asm, "ax"

    /* Multiboot v1 Header. Required so that we can be booted by QEMU via the "-kernel" parameter. */
    __boot_header_mbi1:
        .long 0x1badb002    # magic
        .long 0x0           # flags
        .long -0x1badb002   # checksum: magic + flags + checksum must be 0

    # We are booted by Multiboot1. We end up in 32-bit protected mode (without paging by the way).
    # Thus, we produce 32-bit code. 64-bit code has other opcodes!
    .code32

    # Start symbol. Referenced in linker script. Entry in the ELF.
    .global start
    start:
        # move "0xbadb001" immediate into register edx and hlt the processor
        mov $0xbadb001, %edx
        cli
        hlt
        ud2
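The third header field deserves a short explanation: Multiboot1 requires that magic, flags, and checksum add up to zero as an unsigned 32-bit sum, which is why the header stores the negated magic when the flags are zero. The following C sketch checks that arithmetic on the host; the macro and function names are my own and not part of the project.

```c
#include <assert.h>
#include <stdint.h>

#define MB1_MAGIC UINT32_C(0x1badb002)
#define MB1_FLAGS UINT32_C(0x0)
/* Same value the assembler emits for ".long -0x1badb002" (flags are 0). */
#define MB1_CHECKSUM ((uint32_t)(UINT32_C(0) - (MB1_MAGIC + MB1_FLAGS)))

/* Compile-time check: the three fields sum to zero modulo 2^32. */
static_assert((uint32_t)(MB1_MAGIC + MB1_FLAGS + MB1_CHECKSUM) == 0,
              "Multiboot1 header fields must sum to zero");

/* Returns 1 if the three header fields form a valid Multiboot1 header. */
static int mb1_header_is_valid(uint32_t magic, uint32_t flags, uint32_t checksum) {
    return magic == MB1_MAGIC && (uint32_t)(magic + flags + checksum) == 0;
}
```

If you ever set flag bits in the header, the checksum has to change accordingly, otherwise the bootloader does not recognize the header.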
The corresponding linker script may look like this:
    /* Symbol comes from main.S */
    ENTRY(start)

    OUTPUT_FORMAT("elf32-i386")
    OUTPUT_ARCH("i386:i386")

    /* Program headers. Also called segments. */
    PHDRS
    {
        init_asm PT_LOAD;
    }

    SECTIONS {
        .init_asm 8M :
        {
            *(.init_asm)
        } : init_asm

        /DISCARD/ :
        {
            *(.note.*)
            *(.eh_frame*)
            *(.got*)
        }
    }
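It can be assembled and linked with these two commands:
    gcc -m32 -c -o main.o main.S
    ld -o kernel -Tlink.ld main.o
Here, kernel is the name of the output file; inside a Makefile rule you would typically write $@ (the rule’s target) instead. Please look into the Makefile of the project. You can just build everything with make.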
2. Set Up a Stack So That We Can Use call/ret
Calling a function and returning from it in high-level code maps to the x86 instructions call and ret. For that to work, we need a stack, because call stores the return address there. The stack is pointed to by register esp. After the handoff from the Multiboot1 bootloader, we do not necessarily have a valid stack. Thus, we set one up ourselves. We need some backing memory for that. One easy option is to embed the memory inside the ELF file of the kernel. We do so by adding the following to main.S:
    # Backing storage for a minimal stack. The memory will be static inside the final ELF.
    .align 4
    boot_stack_begin:
        # 1 KiB stack is more than enough for the boot code
        .fill 1024, 1, 0
    boot_stack_end:
We also need to add mov $boot_stack_end, %esp to our start function. That’s it, our stack works! Note that the stack grows downwards, towards lower addresses. Thus, we set the stack pointer in register esp to the top of the stack, which is its highest address (boot_stack_end).
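If the "grows downwards" part feels abstract, the following C sketch models it on the host. It is a toy model under my own naming (toy_stack, toy_push32, toy_pop32), not code from the project: esp starts at the end of the 1 KiB buffer, a push (what call does with the return address) first decrements esp by 4 and then stores, and a pop (what ret does) loads and then increments.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BOOT_STACK_SIZE 1024u

typedef struct {
    uint8_t memory[BOOT_STACK_SIZE]; /* boot_stack_begin .. boot_stack_end */
    uint32_t esp;                    /* offset into memory, models %esp */
} toy_stack;

/* Models "mov $boot_stack_end, %esp": start at the highest address. */
static void toy_stack_init(toy_stack *s) {
    s->esp = BOOT_STACK_SIZE;
}

/* Models a 32-bit push: decrement first, then store. */
static void toy_push32(toy_stack *s, uint32_t value) {
    assert(s->esp >= 4); /* otherwise the stack would overflow */
    s->esp -= 4;
    memcpy(&s->memory[s->esp], &value, sizeof value);
}

/* Models a 32-bit pop: load first, then increment. */
static uint32_t toy_pop32(toy_stack *s) {
    uint32_t value;
    assert(s->esp + 4 <= BOOT_STACK_SIZE); /* otherwise the stack is empty */
    memcpy(&value, &s->memory[s->esp], sizeof value);
    s->esp += 4;
    return value;
}
```

Had we initialized esp with boot_stack_begin instead, the very first push would write below our reserved memory.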
3. Write the Actual Driver Code
Now, let’s write the code of the driver! I’ve split the driver into two functions: one prints a single character and one prints a null-terminated string to the debugcon device. Let’s start with the debugcon_print_byte function:
    # Prints a character to the QEMU debugcon port.
    #
    # Parameters:
    #   1: %al - Character to print.
    # Clobbers: %edx/%dx
    #
    # Example:
    # ```asm
    # mov $'o', %al
    # call debugcon_print_byte
    # ```
    debugcon_print_byte:
        # 0xe9 => I/O port of debugcon device
        # movw: 16 bit
        movw $0xe9, %dx
        out %al, %dx
        ret
As you can see, we have to come up with a calling convention ourselves. As we print bytes, I decided to use the al register, which is the lowest 8 bits of the eax register. When the function is done, ret returns to the function that called it. The return address it needs is on the stack: call pushes it and ret pops it again, so a matching call/ret pair handles this automatically.
Now, we want to print a whole null-terminated string. Try to think how you would program this in C or Rust: You have a pointer to the string. To get the characters, you have to dereference it, thus, load the value behind the pointer. This dereferenced byte is the next character or zero. We can do this within a loop!
    # Prints a C-style string to the QEMU debugcon port.
    #
    # Parameters:
    #   1: %eax - Pointer to the beginning of the string.
    # Clobbers: %eax, %ecx, %dx (via debugcon_print_byte)
    debugcon_print_string_until_null:
        # load byte behind string pointer
        movb (%eax), %cl
        # null byte check
        cmp $0, %cl
        # jump if the byte is zero (end of string)
        jz out
        # prepare function arguments; %cl => %al (the string pointer moves to %ecx)
        xchg %eax, %ecx
        call debugcon_print_byte
        # restore string pointer in %eax
        xchg %eax, %ecx
        # increase string pointer
        inc %eax
        # do again for next byte of string
        jmp debugcon_print_string_until_null
    out:
        ret
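For comparison, this is roughly how I would write the same loop in C. It is a host-side sketch, not the project’s code: instead of writing to port 0xe9, it hands each byte to a caller-supplied function (the byte_sink type and capture_byte helper are names I made up), so the logic can be tried without QEMU.

```c
#include <assert.h>
#include <stddef.h>

/* Receives one byte; stands in for debugcon_print_byte. */
typedef void (*byte_sink)(char byte);

/* Same control flow as the assembly: load a byte, stop on zero,
 * otherwise emit it and advance the pointer. */
static void print_string_until_null(const char *str, byte_sink emit) {
    assert(str != NULL && emit != NULL);
    while (*str != '\0') {
        emit(*str);
        str++;
    }
}

/* Host-side sink: collects the bytes in a buffer instead of a device. */
static char captured[64];
static size_t captured_len;

static void capture_byte(char byte) {
    if (captured_len < sizeof captured - 1) {
        captured[captured_len++] = byte;
        captured[captured_len] = '\0';
    }
}
```

The assembly version has to shuffle eax and ecx with xchg because both the pointer and the character argument want to live in eax/al; in C the compiler takes care of such register juggling.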
4. Prepare the Function Argument (the String to Print) and Invoke the Driver
Now, we have to embed a null-terminated string in the binary and pass its pointer to debugcon_print_string_until_null, and the magic happens.
    # Place a null-terminated ASCII string statically/"as is" inside the binary.
    hello_world_str:
        .asciz "Hello World from Assembly Code\n"

    .code32
    # Start symbol. Referenced in linker script. Entry in the ELF.
    .global start
    start:
        # Set up stack so that we can use "call" and "ret".
        mov $boot_stack_end, %esp
        # Prepare function argument: string pointer into register eax
        mov $hello_world_str, %eax
        call debugcon_print_string_until_null
        # pause processor: clear interrupts and halt
        cli
        hlt
        ud2
If we execute this with make run, we can see that QEMU prints the bytes written to the debugcon device to stdout. That’s it. That’s possibly the simplest x86 driver there is. If you wonder what $hello_world_str means: the $ marks an immediate operand in AT&T syntax, and the symbol is replaced with the link address of hello_world_str at build/link time.
Running make && make run will print something like this:
    ./run_qemu.sh
    Executing: qemu-system-i386 -nodefaults -monitor vc -kernel kernel -vga std -machine q35,accel=kvm:tcg -m 16M -debugcon stdio -no-reboot
    -------
    Hello World from Assembly Code
Running objdump -DSC kernel shows us the following disassembly of the final binary (excerpt):
    0080002c <start>:
      80002c: bc 58 04 80 00    mov    $0x800458,%esp
      800031: b8 0c 00 80 00    mov    $0x80000c,%eax
      800036: e8 04 00 00 00    call   80003f <debugcon_print_string_until_null>
We can see how the stack is set up and how the function argument (the string pointer) is prepared. Pretty neat, isn’t it?
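The addresses in the disassembly can be reproduced with a bit of arithmetic. The C sketch below assumes that main.S places the Multiboot header first, then the string, then start (which is what the addresses suggest, but the helper names and that layout assumption are mine). It also decodes the call: the bytes e8 04 00 00 00 encode a relative offset of 4, counted from the end of the 5-byte instruction.

```c
#include <assert.h>
#include <stdint.h>

#define LINK_BASE       0x800000u /* "8M" in the linker script */
#define MB1_HEADER_SIZE (3u * 4u) /* three .long fields */
#define BOOT_STACK_SIZE 1024u     /* .fill 1024, 1, 0 */
#define CALL_REL32_SIZE 5u        /* opcode e8 + 4-byte offset */

/* Same bytes that .asciz emits, including the terminating zero. */
static const char hello_world_str[] = "Hello World from Assembly Code\n";
static_assert(sizeof hello_world_str == 32, "31 characters plus NUL");

/* Assumed layout: the string directly follows the Multiboot header. */
static uint32_t hello_world_str_addr(void) {
    return LINK_BASE + MB1_HEADER_SIZE;
}

/* Assumed layout: start directly follows the string. */
static uint32_t start_addr(void) {
    return hello_world_str_addr() + (uint32_t)sizeof hello_world_str;
}

/* A relative call jumps to: address of the next instruction + offset. */
static uint32_t call_target(uint32_t call_addr, int32_t rel32) {
    return call_addr + CALL_REL32_SIZE + (uint32_t)rel32;
}

/* The stack begins 1 KiB below the value loaded into %esp. */
static uint32_t boot_stack_begin_addr(uint32_t boot_stack_end) {
    return boot_stack_end - BOOT_STACK_SIZE;
}
```

With the values from the excerpt, this yields 0x80000c for the string, 0x80002c for start, 0x80003f as the call target, and 0x800058 as the lowest address of the stack.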