Foreword
Hermes is a capability-based microkernel. As a microkernel, Hermes is responsible for certain core operations of an operating system, including:
- Physical memory allocation
- Address space isolation
- Task scheduling
- Interprocess communication
- Interrupt handling and dispatch
- Capability rights enforcement
- Userspace I/O port mediation
- Userspace MMIO mediation
Hermes is a low-level part of a broader operating system design and requires a separate set of drivers and low-level services to provide a higher-level application platform (e.g. POSIX).
Hermes is designed to replace its predecessor, the Helios microkernel.
Important
This documentation is a work in progress. Pages indicated with “🔸” in the sidebar are stubs.
High-level design
The high-level design of Hermes is to a certain extent inspired by the design of the seL4 kernel, but with many notable differences.
Hermes schedules tasks to execute user code on available CPUs. Each task is provided with a capability space and an address space, either of which may be shared with other tasks in whole or in part, allowing for the implementation of threads, processes, or shared memory IPC.
The capabilities which are available in a task’s capability space define the scope of I/O and IPC operations available to that task, and task isolation is achieved by curating the list of capabilities available to a task. These capabilities may be based on IPC, to communicate with services running in other tasks or processes, or may offer other rights, such as access to IRQs and memory-mapped I/O, to facilitate the implementation of drivers in userspace.
Getting Hermes
The source code for Hermes is available on SourceHut under the GNU GPLv3 license. Information about dependencies and compiling the kernel is provided in the repository.
Booting Hermes
Bootloaders are provided for each supported target in the kernel repository. See System initialization for information about giving the kernel some work to do once booted.
x86_64 EFI
An EFI bootloader is available in ./boot/efi and will be compiled if
ENABLE_EFI=1 in your config.mk file. The built EFI bootloader is written to
./boot/efi/bootx64.efi and should be installed at /EFI/boot/bootx64.efi on
the EFI System Partition of your boot media.
The EFI bootloader loads the kernel from /hermes of the boot media and loads
boot modules from /modules in (ASCII) alphabetical order. The first boot
module is used as init by the kernel.
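The ordering rule above decides which module becomes init, so it can help to check it before building boot media. The following is a minimal sketch, assuming the bootloader compares whole file names byte by byte; the function name and the module names in the usage note are hypothetical and not taken from the Hermes source.

```c
#include <stdlib.h>
#include <string.h>

/* Compare two module names byte by byte (ASCII order for ASCII names). */
static int cmp_names(const void *a, const void *b)
{
    const char *const *x = a;
    const char *const *y = b;
    return strcmp(*x, *y);
}

/*
 * Sort the module names in place and return the one that sorts first,
 * which is the module that would be used as init under the stated
 * assumption. Returns NULL when there are no modules.
 */
const char *first_boot_module(const char **names, size_t n)
{
    if (n == 0) {
        return NULL;
    }
    qsort(names, n, sizeof(names[0]), cmp_names);
    return names[0];
}
```

With the hypothetical names `10-serial`, `net`, `00-init`, and `Zfs`, the function returns `00-init`. In ASCII, digits sort before uppercase letters, which sort before lowercase letters, so a numeric prefix is a simple way to pin the init module.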
x86_64 multiboot
A multiboot-compatible legacy BIOS bootloader is provided in ./boot/multiboot
and will be compiled if ENABLE_LEGACY=1 in your config.mk file. The
bootloader is written to ./boot/multiboot/sysboot.mb and should be loaded by a
multiboot-compatible bootloader such as syslinux or grub.
Multiboot modules are loaded as boot modules and passed to the kernel in the order defined by the multiboot environment. The first boot module is used as init by the kernel.
System design
Hermes is designed primarily around the use of capabilities. A capability is an unforgeable object that offers its bearer various operations associated with a kernel object, such as a page of memory or an address space.
Capabilities represent rights for various object types, including resources managed by the kernel or IPC objects used to communicate with services and other processes. Capabilities supported by the kernel are enumerated and documented in the Capability API.
The following verbs are associated with capabilities:
- Send: A send operation sends a message to a capability.
- Recv: A receive operation receives a message from a capability.
- Call: A call operation sends a message to a capability and then blocks until a reply is received.
- Reply: Replying to a call will unblock the sender and deliver the outcome of an operation to it.
- Invoke: Calling, receiving, or replying are all ways to invoke a capability.
- Transfer: IPC interactions via endpoints may cause capabilities to be transferred, copying or moving them from one task to another.
A capability resides in a capability slot, or “cslot”. Most capabilities reside in a capability space, or “CSpace”, which provides addressable storage for capability slots.
Memory management
Memory capabilities provide access to general-purpose memory to the bearer, and from these capabilities, various kernel objects may be allocated, such as address spaces and page tables, tasks, IPC objects, and so on.
On startup, after loading the kernel and the user init program, all remaining general-purpose memory is enumerated and provided to the user init program in the form of these Memory capabilities. The user init program may allocate resources from its Memory capabilities, subdivide them into further Memory capabilities, or transfer them to other processes.
Allocating objects
Objects are allocated via Memory::ALLOC. Some objects have a fixed size,
such as Tasks, while others accept a size parameter that governs the size of the
resulting object. For example, a CSpace has a radix parameter that determines
the number of available capability slots.
Reference counts
Objects allocated from memory capabilities have reference counts. Copying or destroying the capability associated with an object generally updates its reference count. The maximum number of references to any kernel object is 256; further attempts to copy capabilities that reference such an object will fail with ERANGE.
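The limit can be pictured with a small model. This is a sketch only, not the kernel's implementation: the structure, the function names, and the `HERMES_` prefix are hypothetical, and the counter is stored as a plain 16-bit integer for clarity. The error value 5 is the ERANGE code from the error table.

```c
#include <stdint.h>

enum {
    HERMES_ERANGE   = 5,   /* ERANGE from the error code table */
    HERMES_MAX_REFS = 256  /* documented maximum number of references */
};

/* Hypothetical stand-in for a kernel object header. */
struct kobject {
    uint16_t refs;
};

/* Model of copying a capability: take one more reference, or fail. */
int kobject_ref(struct kobject *obj)
{
    if (obj->refs >= HERMES_MAX_REFS) {
        return HERMES_ERANGE;
    }
    obj->refs++;
    return 0;
}

/* Model of destroying a capability: drop one reference. */
void kobject_unref(struct kobject *obj)
{
    if (obj->refs > 0) {
        obj->refs--;
    }
}
```

Starting from one reference, 255 further copies succeed and the next one returns ERANGE; destroying any one of the copies makes room for a new copy.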
Overhead
Each memory area stores a bitmap of allocations and a list of reference counts, as well as some metadata. The overhead is subtracted from the total amount of available memory in a Memory capability.
Contiguous allocations
Over time, a Memory capability can become fragmented, and allocations are not guaranteed to occupy contiguous ranges of physical memory.
If contiguous allocations are required, for example to establish a large area for DMA, it is recommended to allocate a new Memory capability and then allocate Pages from that capability. Allocations on a new Memory capability are guaranteed to be contiguous until the first object is freed, after which point the allocations may be non-contiguous.
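To see why contiguity holds only until the first free, consider a toy allocation bitmap. This sketch assumes a first-fit policy over a 64-page area; the actual policy used by the kernel is not specified here, and all names are hypothetical.

```c
#include <stdint.h>

#define AREA_PAGES 64

/* One bit per page: 1 means allocated, 0 means free. */
struct area {
    uint64_t used;
};

/* First-fit allocation of a single page. Returns the page index or -1. */
int area_alloc(struct area *a)
{
    for (int i = 0; i < AREA_PAGES; i++) {
        uint64_t bit = UINT64_C(1) << i;
        if (!(a->used & bit)) {
            a->used |= bit;
            return i;
        }
    }
    return -1;
}

/* Release a page so that a later allocation may reuse it. */
void area_free(struct area *a, int page)
{
    a->used &= ~(UINT64_C(1) << page);
}
```

On a fresh area, successive allocations return pages 0, 1, 2, and so on. After page 1 is freed, the next two allocations return pages 1 and 3, which are no longer adjacent. That is the situation a DMA buffer has to avoid.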
Object sizes
The amount of memory required for a kernel object is either a fixed size, or is a function of its initialization parameters. The semantics for memory allocation are described on each allocatable capability’s page in the documentation.
By scrutinizing these semantics, userspace programs may reckon the amount of
memory available in a Memory capability before and after any allocation. This
information may also be queried at runtime via Memory::GET_FREE_PAGES.
Capability spaces
Address spaces
Tasks
Interprocess Communication (IPC)
Fault handling
Memory-mapped I/O and I/O ports
Interrupt processing
System initialization
Initial capabilities
Bootinfo
System calls
Access to kernel resources and IPC primitives is accomplished primarily through system calls (syscalls). There are a small number of syscalls, most of which are used to interact with capabilities. Note that the syscall API differs from the capability API, which is largely built on top of SYS_call.
Common ABI types
Type declarations for Hare code are provided by the uapi module in the kernel source tree.
| Type | Description |
|---|---|
| u8 | 8-bit unsigned integer |
| u16 | 16-bit unsigned integer |
| u32 | 32-bit unsigned integer |
| u64 | 64-bit unsigned integer |
| uaddr | 64-bit memory address |
| caddr | 32-bit capability address |
| ctype | 8-bit capability type |
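For readers working from C rather than Hare, the table above could be mirrored with fixed-width typedefs. This is a sketch, not a header shipped with the kernel; the Hare declarations in the uapi module remain the authoritative definitions.

```c
#include <stdint.h>

typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;

typedef uint64_t uaddr; /* 64-bit memory address */
typedef uint32_t caddr; /* 32-bit capability address */
typedef uint8_t  ctype; /* 8-bit capability type */

/* Compile-time checks that the widths match the table. */
_Static_assert(sizeof(uaddr) == 8, "uaddr must be 64 bits");
_Static_assert(sizeof(caddr) == 4, "caddr must be 32 bits");
_Static_assert(sizeof(ctype) == 1, "ctype must be 8 bits");
```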
Capability types
| ID | Type | Description |
|---|---|---|
| 0 | NULL | Empty capability slot |
| 1 | MEMORY | General-purpose memory |
| 2 | CSPACE | Capability space |
| 3 | VSPACE | Virtual address space |
| 4 | TASK | Schedulable task |
| 5 | PAGE | Page of memory |
| 6 | NOTIFICATION | IPC notification |
| 7 | ENDPOINT | IPC endpoint |
| 8 | REPLY | IPC reply |
x86_64 specific
| ID | Type | Description |
|---|---|---|
| 9 | PDPT | Page-directory pointer table |
| 10 | PD | Page directory |
| 11 | PT | Page table |
| 12 | IOCONTROL | I/O control |
| 13 | IOPORT | I/O port |
| 14 | IRQCONTROL | IRQ control |
| 15 | IRQ | IRQ |
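The two tables above map naturally onto an enumeration. The sketch below uses a hypothetical `CT_` prefix (chosen to avoid clashing with C's `NULL` macro) and a hypothetical lookup helper; only the numeric IDs and type names come from the tables.

```c
#include <stddef.h>

enum ctype_id {
    CT_NULL = 0,
    CT_MEMORY = 1,
    CT_CSPACE = 2,
    CT_VSPACE = 3,
    CT_TASK = 4,
    CT_PAGE = 5,
    CT_NOTIFICATION = 6,
    CT_ENDPOINT = 7,
    CT_REPLY = 8,
    /* x86_64 specific */
    CT_PDPT = 9,
    CT_PD = 10,
    CT_PT = 11,
    CT_IOCONTROL = 12,
    CT_IOPORT = 13,
    CT_IRQCONTROL = 14,
    CT_IRQ = 15
};

/* Return the documented name for a capability type ID. */
const char *ctype_name(unsigned id)
{
    static const char *const names[] = {
        "NULL", "MEMORY", "CSPACE", "VSPACE", "TASK", "PAGE",
        "NOTIFICATION", "ENDPOINT", "REPLY", "PDPT", "PD", "PT",
        "IOCONTROL", "IOPORT", "IRQCONTROL", "IRQ"
    };
    if (id >= sizeof(names) / sizeof(names[0])) {
        return "(unknown)";
    }
    return names[id];
}

/* True for the IDs listed in the x86_64-specific table. */
int ctype_is_x86_64_specific(unsigned id)
{
    return id >= CT_PDPT && id <= CT_IRQ;
}
```

A helper like `ctype_name` is handy when logging the result of SYS_ident, which returns one of these IDs.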
x86_64 ABI
The x86_64 syscall ABI is based on the System-V ABI. All arguments are passed in registers. Note that floating point registers are not used by the ABI (and are preserved by the kernel).
Input registers
| Register | Purpose |
|---|---|
| %rax | Syscall number (8 bits) & flags |
| %rdi | (a1) 1st argument register |
| %rsi | (a2) 2nd argument register |
| %rdx | (a3) 3rd argument register |
| %r10 | (a4) 4th argument register [1] |
| %r8 | (a5) 5th argument register |
| %r9 | (a6) 6th argument register |
Output registers
| Register | Purpose |
|---|---|
| %rax | Syscall outcome (8 bits) + 56 syscall-specific bits |
| %rsi | (r1) 1st return register |
| %rdx | (r2) 2nd return register |
The syscall outcome is 0 on success or an error code on failure.
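The tables say that %rax carries an 8-bit field alongside other bits, but not where the field sits. The sketch below assumes the 8-bit field occupies the low byte and the remaining 56 bits sit above it. That layout is an assumption for illustration, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Combine an 8-bit field (syscall number or outcome) with up to 56
 * further bits. ASSUMPTION: the 8-bit field is the low byte of %rax.
 */
uint64_t rax_pack(uint8_t low, uint64_t high56)
{
    return ((high56 & UINT64_C(0x00FFFFFFFFFFFFFF)) << 8) | low;
}

/* Extract the 8-bit outcome: 0 on success, otherwise an error code. */
uint8_t rax_outcome(uint64_t rax)
{
    return (uint8_t)(rax & 0xFF);
}

/* Extract the 56 syscall-specific bits. */
uint64_t rax_extra(uint64_t rax)
{
    return rax >> 8;
}
```

Under this assumption, a caller would test `rax_outcome(rax) == 0` before interpreting the return registers or the syscall-specific bits.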
Other registers
| Register | Purpose |
|---|---|
| %r12-%r15 | Kernel-saved |
| %rbp | Kernel-saved |
| %rbx | Kernel-saved |
| %fs, %gs | Kernel-saved |
Note
Certain syscalls, notably SYS_call and SYS_recv, differ from the standard register allocation.
Error codes
The following error codes are defined:
| Code | Name | Description |
|---|---|---|
| 0 | n/a | Indicates a successful outcome. |
| 1 | EWRONGTYPE | An incorrect resource type was used in an operation. |
| 2 | ENOMEM | Insufficient memory available for requested operation. |
| 3 | EINVALID | An invalid parameter was supplied for an operation. |
| 4 | EINVALCADDR | An invalid capability address was used. |
| 5 | ERANGE | A parameter exceeds the permissible range. |
| 6 | EEXIST | A resource already exists at the given address. |
| 7 | ENOENT | A required resource does not exist. |
| 8 | EBUSY | A required resource is currently in use. |
| 9 | ENOTSUP | The requested operation is not supported. |
| 10 | ENOSYS | Invalid syscall or function. |
| 11 | EFAULT | Use of invalid address. |
| 12 | EL1PT | VSpace mapping is missing a required level 1 page table. |
| 13 | EL2PT | VSpace mapping is missing a required level 2 page table. |
| 14 | EL3PT | VSpace mapping is missing a required level 3 page table. |
| 15 | EL4PT | VSpace mapping is missing a required level 4 page table. |
| 16 | EDESTROYED | A resource was destroyed during the operation. |
[1] Note that the System-V ABI assigns the 4th argument to %rcx.
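The error code table can be mirrored in C for logging and diagnostics. This is a sketch: the `HERMES_` prefix (used to avoid clashing with the `errno.h` macros of the same names) and both helper functions are hypothetical. Only the numeric codes and names come from the table.

```c
#include <stddef.h>

enum hermes_error {
    HERMES_OK = 0,
    HERMES_EWRONGTYPE = 1,
    HERMES_ENOMEM = 2,
    HERMES_EINVALID = 3,
    HERMES_EINVALCADDR = 4,
    HERMES_ERANGE = 5,
    HERMES_EEXIST = 6,
    HERMES_ENOENT = 7,
    HERMES_EBUSY = 8,
    HERMES_ENOTSUP = 9,
    HERMES_ENOSYS = 10,
    HERMES_EFAULT = 11,
    HERMES_EL1PT = 12,
    HERMES_EL2PT = 13,
    HERMES_EL3PT = 14,
    HERMES_EL4PT = 15,
    HERMES_EDESTROYED = 16
};

/* Return the documented name for an error code. */
const char *hermes_strerror(unsigned code)
{
    static const char *const names[] = {
        "success", "EWRONGTYPE", "ENOMEM", "EINVALID", "EINVALCADDR",
        "ERANGE", "EEXIST", "ENOENT", "EBUSY", "ENOTSUP", "ENOSYS",
        "EFAULT", "EL1PT", "EL2PT", "EL3PT", "EL4PT", "EDESTROYED"
    };
    if (code >= sizeof(names) / sizeof(names[0])) {
        return "unknown error";
    }
    return names[code];
}

/*
 * For EL1PT..EL4PT, return the level (1 to 4) of the missing page
 * table; return 0 for every other code.
 */
int hermes_missing_pt_level(unsigned code)
{
    if (code >= HERMES_EL1PT && code <= HERMES_EL4PT) {
        return (int)(code - HERMES_EL1PT) + 1;
    }
    return 0;
}
```

The second helper reflects how the EL1PT to EL4PT codes might be used: a mapping attempt that fails with one of them tells the caller which level of page table to allocate and map before retrying.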
SYS_schedule
Yields the current task's time slice to the scheduler.
Inputs
None.
Outputs
None.
SYS_cpu_stat
Returns timing statistics about a given CPU.
Inputs
| Register | Type | Purpose |
|---|---|---|
| a1 | u32 | Target CPU ID |
| a2 | uaddr | cpu_stat structure |
The ID of each CPU is noted in the bootinfo structure’s cpu_info list.
cpu_stat structure
Each field of the cpu_stat structure represents the amount of time this CPU has spent in each state from an arbitrarily defined epoch. Time increases monotonically at the rate defined by the tick_rate parameter of the bootinfo structure.
// CPU usage statistics.
export type cpu_stat = struct {
	idle: u64,
	user: u64,
	kern: u64,
};
Outputs
The cpu_stat structure is filled in with the latest timing info.
Errors
- ENOENT: No CPU by this ID was found
- EFAULT: The address of the cpu_stat structure is not valid
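Because the counters only ever increase from an arbitrary epoch, a single reading says little on its own; the difference between two readings is what matters. The sketch below mirrors the cpu_stat structure in C and computes how busy a CPU was between two samples. The helper name is hypothetical, and obtaining the samples (by issuing SYS_cpu_stat twice) is left out.

```c
#include <stdint.h>

/* C mirror of the Hare cpu_stat structure: three consecutive u64 fields. */
struct cpu_stat {
    uint64_t idle;
    uint64_t user;
    uint64_t kern;
};

/*
 * Percentage of time spent in user or kernel mode between two samples
 * of the same CPU. Returns -1 if no time elapsed between the samples.
 */
int cpu_busy_percent(const struct cpu_stat *before,
                     const struct cpu_stat *after)
{
    uint64_t idle = after->idle - before->idle;
    uint64_t busy = (after->user - before->user)
                  + (after->kern - before->kern);
    uint64_t total = idle + busy;

    if (total == 0) {
        return -1;
    }
    return (int)(busy * 100 / total);
}
```

Since the result is a ratio of tick counts, tick_rate is not needed here; it only comes into play when converting a tick delta into wall-clock time.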
SYS_ident
Identifies a capability type.
Inputs
| Register | Type | Purpose |
|---|---|---|
| a1 | caddr | Capability address |
Outputs
| Register | Type | Purpose |
|---|---|---|
| r1 | ctype | Capability type |
Errors
- EINVALCADDR: Capability address is invalid