fabricBIOS Resource Types
This document describes each resource type supported by fabricBIOS: wire encoding, enforcement mechanisms, lease lifecycle, and failure modes.
Overview
fabricBIOS defines a set of resource types that nodes advertise in their inventory. Each resource has a type code, capacity unit, and a set of operations for lease management and data-plane access.
See Premium Dataplane Methodology for the canonical reference on fabricBIOS’s premium dataplane model (RDMA, NVMe-oF, SR-IOV, GPU, CXL).
| Code | Constant | Type | Capacity Unit | Status |
|---|---|---|---|---|
| 0x0001 | RES_TYPE_CPU | CPU | Core count | Advertised; Linux topology-aware reservation implemented |
| 0x0002 | RES_TYPE_MEM | MEM | Bytes | Full data-plane (FBMU, RDMA) |
| 0x0003 | RES_TYPE_BLOCK | BLOCK | Bytes (or sectors) | Full data-plane (FBBU); premium target/session bindings for NVMe-oF |
| 0x0004 | RES_TYPE_NET | NET | Bits/sec | Advertised; Linux macvlan and SR-IOV support implemented |
| 0x0005 | RES_TYPE_GPU | GPU | VRAM bytes / compute units | Advertised; HIP/ROCm path implemented, CUDA discovery partial |
| 0x0010 | RES_TYPE_SCHEDULER | SCHEDULER | HTTP port | Service discovery via ANNOUNCE; capacity=port, available=1 |
| 0x00FF | — | VENDOR | Vendor-defined | Extension space |
Wire Encoding
Flat Inventory Format
Resources are advertised in GET_INVENTORY responses using a flat wire format:
- count : u16
- repeated count times:
  - resource_id : u128 (16 bytes)
  - resource_type : u16 (2 bytes)
  - flags : u16 (2 bytes)
  - capacity : u64 (8 bytes)
  - available : u64 (8 bytes)

Each entry is 36 bytes. All integers are big-endian.
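
As an illustration of the flat layout described above, the following is a minimal decoding sketch. The function name `parse_flat_inventory` and the dictionary keys are hypothetical; only the field order, widths, and byte order come from this document.

```python
import struct

ENTRY_SIZE = 36  # 16 + 2 + 2 + 8 + 8 bytes, as stated above


def parse_flat_inventory(buf: bytes) -> list[dict]:
    """Decode a flat inventory payload: a u16 count, then `count` 36-byte entries."""
    (count,) = struct.unpack_from(">H", buf, 0)
    expected = 2 + count * ENTRY_SIZE
    if len(buf) < expected:
        raise ValueError(f"truncated inventory: need {expected} bytes, got {len(buf)}")
    entries = []
    offset = 2
    for _ in range(count):
        # struct has no 128-bit code, so the resource_id is read as 16 raw bytes.
        resource_id = int.from_bytes(buf[offset:offset + 16], "big")
        rtype, flags, capacity, available = struct.unpack_from(">HHQQ", buf, offset + 16)
        entries.append({
            "resource_id": resource_id,
            "resource_type": rtype,
            "flags": flags,
            "capacity": capacity,
            "available": available,
        })
        offset += ENTRY_SIZE
    return entries
```

The `>` prefix selects big-endian with no padding, so `>HHQQ` covers exactly the 20 bytes that follow the 16-byte resource_id.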
Structured Inventory Format
The ResourceInventory structure provides typed fields per resource category:
- node_id : [u8; 32]
- epoch : u64
- health : u32
- mem_arenas : list of MemArena
- block_devs : list of BlockDev
- cpu_caps : list of CpuCap
- nics : list of NicInfo
- gpus : list of GpuInfo

Health and Flag Bits
Resource flags (flat format, flags field):
| Bit | Name | Meaning |
|---|---|---|
| 0 | FENCED | Resource is fenced; no new leases granted |
| 1 | DEGRADED | Resource is usable but with reduced guarantees |
Health constants (structured format):
| Value | Constant | Meaning |
|---|---|---|
| 0x00 | HEALTH_OK | Resource is healthy |
| 0x01 | HEALTH_DEGRADED | Degraded but usable |
| 0x02 | HEALTH_FENCED | Fenced, not leasable |
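
The two tables above differ in kind: the flat-format flags are independent bits, while the structured-format health field is a single enumerated value. A small sketch of how a client might interpret each, where the helper names are hypothetical and the constants are taken from the tables:

```python
# Flat-format flag bits (bit positions from the flags table).
FLAG_FENCED = 1 << 0
FLAG_DEGRADED = 1 << 1

# Structured-format health values (from the health table).
HEALTH_NAMES = {0x00: "HEALTH_OK", 0x01: "HEALTH_DEGRADED", 0x02: "HEALTH_FENCED"}


def flag_names(flags: int) -> list[str]:
    """Return the names of all set flag bits; several bits may be set at once."""
    names = []
    if flags & FLAG_FENCED:
        names.append("FENCED")
    if flags & FLAG_DEGRADED:
        names.append("DEGRADED")
    return names


def is_leasable(flags: int) -> bool:
    """A fenced resource grants no new leases; a degraded one is still usable."""
    return not (flags & FLAG_FENCED)


def health_name(value: int) -> str:
    """Map a health value to its constant name, tolerating unknown values."""
    return HEALTH_NAMES.get(value, f"UNKNOWN(0x{value:02x})")
```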
MEM (0x0002) — Memory
Description
Memory resources represent addressable memory regions that can be read and written by remote clients. On Linux, these are backed by mmap’d anonymous memory or file-backed regions. On Pi5 bare-metal, these are DMA arenas carved from physical DRAM.
Inventory Fields (MemArena)
- resource_id : u128
- base : u64 (base address within node)
- len : u64 (size in bytes)
- align : u32 (alignment requirement)
- max_read : u32 (maximum read size per operation)
- max_write : u32 (maximum write size per operation)
- health : u32

Enforcement Mechanisms
| Platform | Backing | Enforcement |
|---|---|---|
| Linux (fabricbiosd) | mmap anonymous / file | Lease lookup per request; dp_key HMAC verification |
| Pi5 bare-metal | Static 64 KB DMA arena (8 KB per lease slot, 8 slots) | Per-lease dp_key + nonce replay cache (64-entry window) |
| Future: RDMA | ibverbs memory region | rkey rotation on expiry; QP teardown |
Lease Lifecycle
- LEASE_ALLOC (QUIC control): Creates a lease with specified duration and grace period. Returns binding TLVs: lease_id, dp_key, endpoint (IP:port), limits (offset + length).
- LEASE_RENEW (QUIC control): Extends lease expiry. Must be called within the renewal window (last 20% of lease duration).
- LEASE_QUERY (QUIC control): Returns current lease status (ACTIVE, EXPIRED, REVOKED) and expiry time.
- Data-plane I/O (FBMU on UDP 5702):
  - HELLO: Presents lease_id, receives resource_len and I/O limits.
  - READ: offset + length -> data.
  - WRITE: offset + length + data -> status.
  - Each request carries lease_id, nonce, auth_tag (HMAC-SHA256 of header using dp_key).
- LEASE_FREE (QUIC control): Explicitly releases the lease. Subsequent data-plane ops return NO_LEASE.
- Expiry: Background tick_leases scan evicts expired leases. Data-plane ops on expired leases return NO_LEASE.
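
Two steps in the lifecycle above lend themselves to a short sketch: the renewal window and the per-request auth_tag. This document does not specify the FBMU header layout, so `build_header` below uses a hypothetical layout (lease_id, nonce, offset, length, all big-endian) purely for illustration. It is also an assumption that the grace period does not extend the renewal window. Only HMAC-SHA256 keyed with dp_key and the "last 20%" rule come from the text.

```python
import hashlib
import hmac
import struct


def in_renewal_window(granted_at_ms: int, duration_ms: int, now_ms: int) -> bool:
    """True when now falls in the last 20% of the lease duration."""
    expiry = granted_at_ms + duration_ms
    window_start = expiry - duration_ms // 5
    return window_start <= now_ms < expiry


def build_header(lease_id: int, nonce: int, offset: int, length: int) -> bytes:
    """Hypothetical header layout: u128 lease_id, u64 nonce, u64 offset, u32 length."""
    return lease_id.to_bytes(16, "big") + struct.pack(">QQI", nonce, offset, length)


def auth_tag(dp_key: bytes, header: bytes) -> bytes:
    """HMAC-SHA256 of the header, keyed with the per-lease dp_key."""
    return hmac.new(dp_key, header, hashlib.sha256).digest()


def verify_tag(dp_key: bytes, header: bytes, tag: bytes) -> bool:
    """Constant-time comparison avoids leaking how many tag bytes matched."""
    return hmac.compare_digest(auth_tag(dp_key, header), tag)
```

For a 100-second lease granted at t=0, this sketch accepts renewals from t=80 s up to, but not including, t=100 s.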
Failure Modes
- EXPIRED: Lease TTL elapsed. Data-plane returns FBMU_STATUS_NO_LEASE.
- REPLAY: Duplicate nonce detected. Data-plane returns FBMU_STATUS_REPLAY.
- OUT_OF_RANGE: Read/write exceeds allocated region. Returns FBMU_STATUS_RANGE.
- INVALID: Bad auth_tag or unknown lease. Returns FBMU_STATUS_INVALID.
- FENCED: Teardown failure. Resource reports FENCED in discovery; no new leases.
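
The request-level failure modes above can be sketched as a single validation function. The order of the checks, the `Lease` class, and the `"OK"` success value are assumptions made for illustration. The status names and the 64-entry nonce window (from the Pi5 row of the enforcement table) come from this document.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Lease:
    length: int          # size of the leased region in bytes
    expires_at: float    # absolute expiry time, in seconds
    # Bounded replay cache: the oldest nonce is dropped once 64 are stored.
    recent_nonces: deque = field(default_factory=lambda: deque(maxlen=64))


def check_request(leases: dict, lease_id: int, nonce: int, offset: int,
                  length: int, tag_ok: bool, now: float) -> str:
    """Return a status name for one data-plane request (check order is assumed)."""
    lease = leases.get(lease_id)
    if lease is None or not tag_ok:
        return "FBMU_STATUS_INVALID"
    if now >= lease.expires_at:
        return "FBMU_STATUS_NO_LEASE"
    if nonce in lease.recent_nonces:
        return "FBMU_STATUS_REPLAY"
    if offset < 0 or length < 0 or offset + length > lease.length:
        return "FBMU_STATUS_RANGE"
    lease.recent_nonces.append(nonce)
    return "OK"  # hypothetical success value
```

`tag_ok` stands in for the result of the HMAC check, which keeps this sketch independent of any particular header layout.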
BLOCK (0x0003) — Block Storage
Description
Block storage resources expose a fixed-size block device with sector-aligned read/write access. On Linux, these are backed by files or raw block devices with O_DIRECT. On Pi5 bare-metal, the NVMe HAT (PCIe1) provides a real NVMe SSD, or an SD card can serve as block backing.
Inventory Fields (BlockDev)
- resource_id : u128
- block_size : u32 (bytes per block, typically 512)
- capacity : u64 (total capacity in bytes or sectors)
- flags : u32 (BLOCK_RO = bit 0, BLOCK_REMOVABLE = bit 1)
- health : u32

Enforcement Mechanisms
| Platform | Backing | Enforcement |
|---|---|---|
| Linux (fabricbiosd) | File (O_DIRECT) or raw block device | Lease lookup per request; dp_key HMAC verification |
| Pi5 bare-metal (NVMe) | NVMe SSD via PCIe1 | Per-lease dp_key + bounds check (LBA range) |
| Pi5 bare-metal (SD) | SD card via EMMC/SDHCI | Per-lease dp_key + bounds check |
| Future: NVMe-oF | nvmet kernel target | Target/session lifecycle managed on Linux; steady-state remote I/O proof still pending |
Lease Lifecycle
- LEASE_ALLOC with BLOCK type (QUIC control): Allocates a block lease. Returns binding TLVs with endpoint (IP:port for FBBU) and limits (LBA start + count).
- Data-plane I/O (FBBU on UDP 5703):
  - HELLO: Presents lease_id, receives block_size, device_block_cnt, max_blocks_per_io.
  - READ_BLOCK: block_index + block_count -> block_data.
  - WRITE_BLOCK: block_index + block_count + block_data -> status.
  - Each request carries lease_id, nonce, auth_tag.
- LEASE_FREE: Releases the block lease.
- Expiry: Same as MEM — background scan evicts expired leases.
Failure Modes
- Same status codes as FBMU (NO_LEASE, RANGE, REPLAY, INVALID, FENCED).
- BLOCK_RO: Write operations fail on read-only devices.
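
A rough sketch of the block-level checks follows, under two assumptions that this document does not confirm: block_index is taken to be relative to the start of the leased LBA range, and the `"READ_ONLY"` reason string is hypothetical (the text only says that writes fail). The `BlockIoError` class and the function name are likewise illustrative.

```python
BLOCK_RO = 1 << 0  # bit 0 of the BlockDev flags field


class BlockIoError(Exception):
    """Raised with a short reason string when a request must be rejected."""


def translate_block_io(lease_lba_start: int, lease_lba_count: int, block_size: int,
                       dev_flags: int, block_index: int, block_count: int,
                       max_blocks_per_io: int, is_write: bool) -> tuple[int, int]:
    """Validate one request and return (byte_offset_on_device, byte_length)."""
    if is_write and dev_flags & BLOCK_RO:
        raise BlockIoError("READ_ONLY")  # hypothetical reason name
    if block_count <= 0 or block_count > max_blocks_per_io:
        raise BlockIoError("RANGE")
    if block_index < 0 or block_index + block_count > lease_lba_count:
        raise BlockIoError("RANGE")
    # Assumed: block_index is relative to the first LBA of the lease.
    return (lease_lba_start + block_index) * block_size, block_count * block_size
```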
CPU (0x0001) — Compute
Description
CPU resources advertise compute capacity, topology, and CPU lease policy. fabricBIOS does not execute workloads on CPUs directly, but on Linux it does enforce CPU leases with topology-aware cpuset.cpus placement so higher layers such as grafOS can rely on whole-core isolation defaults.
Inventory Fields (CpuCap)
- resource_id : u128
- cores : u16
- threads : u16
- max_mhz : u32
- flags : u32 (bits 16..23: arch, bits 0..15: features, bits 24..27: isolation/topology policy)
- health : u32

cores is the number of physical core groups exported under the current CPU lease policy. threads is the number of online logical CPUs.
Architecture Flags (bits 16..23)
| Value | Constant | Architecture |
|---|---|---|
| 0x01 | ARCH_AARCH64 | AArch64 (ARM64) |
Feature Flags (bits 0..15)
| Bit | Constant | Feature |
|---|---|---|
| 0 | CPU_FEAT_TASKLET_HOST | Can host WASM/ELF tasklets |
| 1 | CPU_FEAT_NEON | ARM NEON SIMD |
| 2 | CPU_FEAT_CRC32 | CRC32 hardware acceleration |
| 3 | CPU_FEAT_TOPOLOGY_AWARE_LEASING | CPU leases are enforced with topology-aware placement |
Isolation / Topology Flags (bits 24..27)
| Bits | Constant | Meaning |
|---|---|---|
| 24..26 | CPU_ISOLATION_BEST_EFFORT | Lease allocation may split SMT siblings; density preferred over isolation |
| 24..26 | CPU_ISOLATION_WHOLE_CORE | Lease allocation uses physical-core groups and keeps SMT siblings together |
| 24..26 | CPU_ISOLATION_STRICT | Lease allocation requires single-thread-per-core groups; fails closed on SMT-sharing hosts |
| 27 | CPU_TOPOLOGY_COMPLETE | Sibling topology was detected for all online CPUs |
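
The bit ranges in the three tables above can be unpacked as sketched below. This document does not list the numeric values of the three isolation constants, so the sketch returns the raw 3-bit field rather than guessing a mapping. The function name and dictionary keys are hypothetical.

```python
ARCH_AARCH64 = 0x01

CPU_FEATURES = {
    0: "CPU_FEAT_TASKLET_HOST",
    1: "CPU_FEAT_NEON",
    2: "CPU_FEAT_CRC32",
    3: "CPU_FEAT_TOPOLOGY_AWARE_LEASING",
}


def decode_cpu_flags(flags: int) -> dict:
    """Split the CpuCap flags word into its documented bit fields."""
    return {
        # bits 0..15: one bit per feature
        "features": [name for bit, name in CPU_FEATURES.items() if flags & (1 << bit)],
        # bits 16..23: architecture code
        "arch": (flags >> 16) & 0xFF,
        # bits 24..26: isolation policy, left raw because values are not documented here
        "isolation_raw": (flags >> 24) & 0x7,
        # bit 27: sibling topology detected for all online CPUs
        "topology_complete": bool(flags & (1 << 27)),
    }
```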
Enforcement Mechanisms
| Platform | Mechanism | Status |
|---|---|---|
| Linux (fabricbiosd) | Topology-aware cpuset.cpus leasing with explicit isolation policy | Implemented |
| Pi5 bare-metal | CPU reservation (advertise-only) | Advertised, no enforcement yet |
Lease Lifecycle
On Linux, CPU leases reserve CPU capacity and create a per-lease cpuset cgroup. The default policy is whole_core, which allocates whole physical-core groups and prevents different leases from landing on SMT siblings of the same core. best_effort is an explicit opt-in density mode, while strict fails closed unless the host exposes single-thread-per-core groups.
If topology data is incomplete, whole_core and strict deny new CPU leases rather than silently degrading to flat logical-CPU placement. best_effort remains available as the explicit degraded policy.
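
The policy rules above can be sketched as a small allocator. This is not the fabricbiosd implementation: the function names, the shape of `core_groups`, and the choice to count logical CPUs under best_effort are all assumptions. The result is formatted as a comma-separated list of the kind a cpuset.cpus file accepts.

```python
from typing import Optional


def allocate_cpus(core_groups: list[list[int]], in_use: set[int], wanted: int,
                  policy: str, topology_complete: bool) -> Optional[list[int]]:
    """Pick logical CPUs for a lease, or return None when the policy denies it.

    core_groups lists, per physical core, the logical CPUs (SMT siblings) on it.
    """
    if policy in ("whole_core", "strict") and not topology_complete:
        return None  # deny rather than degrade to flat logical-CPU placement
    if policy == "strict" and any(len(group) != 1 for group in core_groups):
        return None  # fail closed on SMT-sharing hosts
    if policy == "best_effort":
        # Assumed: best_effort counts logical CPUs and may split siblings.
        free = [cpu for group in core_groups for cpu in group if cpu not in in_use]
        return sorted(free[:wanted]) if len(free) >= wanted else None
    # whole_core / strict: only groups with no busy sibling are eligible.
    free_groups = [g for g in core_groups if not any(cpu in in_use for cpu in g)]
    if len(free_groups) < wanted:
        return None
    return sorted(cpu for group in free_groups[:wanted] for cpu in group)


def cpuset_string(cpus: list[int]) -> str:
    """Format a CPU list as a comma-separated string for cpuset.cpus."""
    return ",".join(str(cpu) for cpu in cpus)
```

On a four-core host with two threads per core, a whole_core lease never shares a physical core with another lease, while a best_effort lease may be handed the idle sibling of a busy core.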
GPU (0x0005) — Accelerator
Description
GPU resources advertise accelerator capacity (VRAM, compute units). On Linux, these can be backed by AMD GPUs via HIP/ROCm (behind gpu-hip feature flag). NVIDIA discovery is implemented via NVML; the broader CUDA compute path remains partial.
Inventory Fields (GpuInfo)
- resource_id : u128
- vram_bytes : u64 (VRAM capacity in bytes)
- compute_units : u32 (CUDA cores / stream processors)
- flags : u32 (GPU_CUDA = bit 0, GPU_ROCM = bit 1)
- health : u32

GPU Flags
| Bit | Constant | Vendor |
|---|---|---|
| 0 | GPU_CUDA | NVIDIA CUDA |
| 1 | GPU_ROCM | AMD ROCm |
Share Modes
GPU resources support two lease sharing policies, controlled by the --gpu-share-mode flag on fabricbiosd control-server:
| Mode | CLI value | Behavior | Default |
|---|---|---|---|
| Exclusive | exclusive | At most one active lease per GPU. Second LEASE_ALLOC returns CapacityExceeded. | Yes |
| Fractional | fractional | Multiple concurrent leases subject to VRAM capacity accounting. | No |
Default: exclusive. Non-partitioned GPUs (RX 6600 class) do not provide hardware isolation between tenants, so exclusive leasing is the only mode that provides strong isolation guarantees. Fractional mode is intended for future use when hardware partitioning (MIG/vGPU/SR-IOV-class semantics) is available.
When a LEASE_ALLOC is denied due to exclusive mode, the response status is CapacityExceeded (same as VRAM exhaustion). The client should free the existing lease before attempting a new allocation.
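
The admission rule for the two share modes can be sketched as follows. The function name and the `"Ok"` success value are hypothetical. The mode names and the CapacityExceeded status come from the text above.

```python
def admit_gpu_lease(mode: str, vram_total: int, active_vram: list[int],
                    vram_requested: int) -> str:
    """Decide whether a new GPU lease fits under the configured share mode.

    active_vram holds the VRAM size of each lease currently active on the GPU.
    """
    if vram_requested > vram_total:
        return "CapacityExceeded"
    if mode == "exclusive":
        # At most one active lease per GPU, regardless of its size.
        return "CapacityExceeded" if active_vram else "Ok"
    if mode == "fractional":
        # Concurrent leases are allowed while total VRAM is not oversubscribed.
        used = sum(active_vram)
        return "Ok" if used + vram_requested <= vram_total else "CapacityExceeded"
    raise ValueError(f"unknown share mode: {mode!r}")
```

Because both denial causes return the same status, a client cannot tell from the status alone whether the GPU is held exclusively or simply out of VRAM.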
Enforcement Mechanisms
| Platform | Mechanism | Status |
|---|---|---|
| Linux (fabricbiosd, gpu-hip) | HIP device submit | Implemented behind feature flag |
| Linux (fabricbiosd, gpu-cuda) | NVML discovery / partial CUDA path | Discovery implemented; compute path partial |
| Pi5 bare-metal | N/A | No GPU on Pi5 |
Lease Lifecycle
GPU lease management follows the same pattern as MEM: allocate, bind, use, expire/revoke. The data-plane binding is vendor-specific (HIP/ROCm API calls). Lease expiry triggers session/context teardown.
NET (0x0004) — Network Interface
Description
Network resources advertise NIC capacity (link speed, MTU). On Linux, these can be backed by macvlan or SR-IOV virtual functions.
Inventory Fields (NicInfo)
- resource_id : u128
- speed_bps : u64 (link speed in bits per second)
- mtu : u32
- flags : u32 (NIC_SRIOV = bit 0, NIC_LINK_UP = bit 1)
- health : u32

NIC Flags
| Bit | Constant | Feature |
|---|---|---|
| 0 | NIC_SRIOV | SR-IOV capable |
| 1 | NIC_LINK_UP | Link is up |
Enforcement Mechanisms
| Platform | Mechanism | Status |
|---|---|---|
| Linux (fabricbiosd) | macvlan / SR-IOV VF assignment | Implemented; real SR-IOV hardware validation still pending |
| Pi5 bare-metal | N/A | Single NIC, not subdivided |
Lease Lifecycle
NET leases assign a VF or macvlan interface for the lease duration on Linux. Expiry tears down the virtual interface. End-to-end SR-IOV passthrough validation on real hardware remains a future milestone.
Common Patterns
Binding TLV Format
All data-plane bindings use a common TLV encoding:
- TLV_LEASE_ID (0x0101) : u128 lease identifier
- TLV_DP_KEY (0x0102) : [u8; 32] per-lease HMAC key
- TLV_UDP_ENDPOINT (0x0103) : [u8; 16] IPv6 addr + u16 port
- TLV_LIMITS (0x0104) : u64 offset + u64 length

RDMA bindings add:
- TLV_RDMA_RKEY (0x0201) : u32 remote key
- TLV_RDMA_REMOTE_ADDR (0x0202) : u64 remote address
- TLV_RDMA_QP_NUM (0x0203) : u32 queue pair number
- TLV_RDMA_GID (0x0204) : [u8; 16] GID
- TLV_RDMA_PORT (0x0205) : u8 port number

Fencing
All resource types follow the same fencing contract:
- Lease expires or is revoked.
- Node attempts data-plane teardown (invalidate rkey, close session, destroy QP, etc.).
- If teardown succeeds: resource is available for new leases.
- If teardown fails: resource enters FENCED state.
  - No new leases granted.
  - Resource reports FENCED flag in discovery.
  - Remediation (reset, power cycle) required to clear fenced state.
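
The fencing contract above amounts to a three-state machine, sketched here. The class, state, and method names are hypothetical, and the `teardown` callable stands in for whatever data-plane cleanup a resource type needs.

```python
from enum import Enum
from typing import Callable


class ResourceState(Enum):
    AVAILABLE = "available"
    LEASED = "leased"
    FENCED = "fenced"


class Resource:
    def __init__(self) -> None:
        self.state = ResourceState.AVAILABLE

    def grant(self) -> bool:
        """Grant a lease only when the resource is available (never when fenced)."""
        if self.state is not ResourceState.AVAILABLE:
            return False
        self.state = ResourceState.LEASED
        return True

    def end_lease(self, teardown: Callable[[], bool]) -> None:
        """Run teardown on expiry or revocation; any failure fences the resource."""
        if self.state is not ResourceState.LEASED:
            return
        try:
            ok = teardown()
        except Exception:
            ok = False  # an exception during teardown counts as a failure
        self.state = ResourceState.AVAILABLE if ok else ResourceState.FENCED

    def remediate(self) -> None:
        """Model an out-of-band reset or power cycle that clears the fenced state."""
        if self.state is ResourceState.FENCED:
            self.state = ResourceState.AVAILABLE
```

Treating an exception the same as a failed teardown keeps the sketch fail-closed: a resource whose cleanup outcome is unknown is never handed to a new tenant.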
Auto-Detection (Linux)
When fabricbiosd control-server --auto-detect is used, the Linux platform detects:
- Memory: Total system memory from /proc/meminfo.
- CPU: Core/thread count, frequency from /proc/cpuinfo and /sys/devices/system/cpu/.
- NICs: Network interfaces from /sys/class/net/, including speed, MTU, SR-IOV VF count.
- GPUs: AMD GPUs via HIP (when gpu-hip feature enabled), NVIDIA GPUs via sysfs.
- Block devices: From /sys/block/, including size and read-only status.
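
For orientation, here is a sketch of how two of the sources listed above can be read. It is not the fabricbiosd detection code, and the function names are hypothetical. The parsers take file contents as strings so they can be exercised without a Linux host.

```python
def parse_mem_total_bytes(meminfo_text: str) -> int:
    """Extract MemTotal from /proc/meminfo contents and convert it to bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Typical line: "MemTotal:       16384256 kB" (the unit is 1024 bytes).
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal not found")


def sysfs_block_bytes(size_text: str) -> int:
    """Convert /sys/block/<dev>/size, which counts 512-byte sectors, to bytes."""
    return int(size_text.strip()) * 512


def sysfs_block_read_only(ro_text: str) -> bool:
    """Interpret /sys/block/<dev>/ro, which holds "1" for read-only devices."""
    return ro_text.strip() == "1"
```

On a Linux host the first function could be fed the result of `open("/proc/meminfo").read()`.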