fabricBIOS Specification v1.1
Note: v1.1 supersedes v0 (fabricBIOS-design-document.md). The v0 document is preserved for historical reference only.
Abstract
fabricBIOS is a minimal firmware-level control-plane specification for disaggregated computing fabrics. It enables nodes to:
- advertise hardware resources and locality,
- establish and maintain fabric trust,
- mint and validate capability tokens,
- create lease-bound bindings to existing data planes (e.g., RDMA, NVMe-oF, vendor accelerator fabrics).
fabricBIOS is not an operating system. It exposes resources and enforces safety-critical limits (authorization, anti-replay, bounded parsing, lease expiry, mandatory teardown, fencing). Scheduling, placement, fairness, paging, and global optimization live above this layer.
1. Layering and Design Principles
1.1 Layering model (informative)
A useful mental model is three layers:
- Physical / Fabric substrate (fabricBIOS): identity, discovery, capabilities, leases, teardown, fencing.
- Fabric control and policy (higher layer): placement, scheduling, admission control, optimization, tenant policy, economic policy.
- Workload runtime: application logic, scaling decisions, traffic shaping, state management.
fabricBIOS exists to make the substrate explicit, verifiable, and bounded, so higher layers can safely and deterministically program disaggregated resources.
1.2 Design principles (normative intent)
- Minimal trusted computing base: small enough to audit and run in firmware/DPU, or behind a small trusted proxy.
- Mechanism, not policy: provides primitives (capabilities, leases, teardown, fencing), not scheduling/placement policy.
- Leverage existing data planes: authorize/bind; do not reimplement bulk transfer.
- Node-addressed, resource-identified: nodes have stable identities; resources have stable IDs.
- Strong security: signatures, audience binding, anti-replay, bounded parsing, least privilege.
- Lease-oriented: bindings have explicit lifetime, renewal, revocation, and mandatory teardown.
- Interop-first: unambiguous wire format, byte order, signing rules, fragmentation behavior, replay handling.
- Profiles for feasibility: multiple secure transport profiles accommodate DPUs and minimal endpoints.
2. Conformance, Terminology, and Document Conventions
2.1 Normative language
The key words MUST, MUST NOT, REQUIRED, SHALL, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL are to be interpreted as described in RFC 2119.
2.2 Normative vs informative sections
Unless explicitly labeled (informative), all requirements in this specification are normative.
2.3 Conformance levels
An implementation is conformant if it satisfies all MUST requirements in:
- Wire protocol (§5),
- Discovery (§6),
- Trust and enrollment (§8),
- Secure transports and profiles (§9),
- Capability tokens (§10),
- Lease management (§11),
- Control plane operations (§12),
- Security requirements (§14).
Discovery-only “listeners” that do not implement leases are permitted for monitoring, but they are not considered conformant fabricBIOS nodes.
3. Scope
3.1 Responsibilities
| Responsibility | Description |
|---|---|
| Discovery | Advertise node identity, locality, and resource inventory |
| Trust bootstrap | Establish fabric trust and maintain identity |
| Capability exchange | Mint, attenuate, validate, and revoke capability tokens |
| Lease management | Create, renew, revoke leases; enforce expiry |
| Data-plane binding | Provide endpoints/credentials for existing data planes |
| Safety enforcement | Rate-limit unauthenticated traffic; bound parsing; enforce teardown |
3.2 Non-responsibilities
| Not fabricBIOS | Why |
|---|---|
| CPU execution/scheduling | Policy; requires isolation, preemption, accounting |
| Memory paging | Policy; higher layers decide caching/placement |
| Filesystem semantics | Policy; fabricBIOS exposes block endpoints |
| QoS fairness | Policy; fabricBIOS may enforce safety limits only |
| Process isolation | OS/hypervisor concern |
| Global optimization | Higher-layer concern |
4. Addressing and Identifiers
4.1 Node addressing
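Each node exposes a primary IPv6 address for the fabricBIOS control endpoint. Operators typically allocate a ULA prefix and assign each node a /64; the node’s fabricBIOS endpoint is commonly interface identifier ::1 within that /64 (that is, the node prefix followed by ::1, not the IPv6 loopback address).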
4.2 Resource identification
Resources are identified by 128-bit UUIDs carried in protocol payloads.
- UUIDs are transmitted as 16 bytes in canonical RFC 4122 byte order.
- A structured UUID layout is RECOMMENDED but not required.
5. Wire Protocol
5.1 Integer encoding and endianness
All multi-byte integer fields are big-endian (network byte order).
5.2 Canonical variable-length encodings
These encodings are normative:
- bytes := u32 len + len bytes (reject if len exceeds implementation limit).
- list := u16 count + repeated items (reject if count exceeds implementation limit).
- tlv := u8 type + u16 len + len bytes.
- Unknown TLV types MUST be skippable (unknown TLVs are not fatal unless explicitly stated).
- Optional fields use a u8 present flag (0=absent, 1=present) followed by the field bytes.
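The encodings above can be sketched in a few lines. This is a minimal illustration, not a reference codec: MAX_BYTES_LEN is a hypothetical implementation limit, and the function names are invented for this example.

```python
import struct
from typing import Dict, Optional, Set, Tuple

MAX_BYTES_LEN = 64 * 1024  # hypothetical implementation limit (not from this spec)


def encode_bytes(data: bytes) -> bytes:
    """bytes := u32 len + len bytes, length in big-endian."""
    return struct.pack(">I", len(data)) + data


def decode_bytes(buf: bytes, off: int = 0) -> Tuple[bytes, int]:
    """Return (value, next_offset); reject oversized or truncated input."""
    if len(buf) - off < 4:
        raise ValueError("truncated length prefix")
    (n,) = struct.unpack_from(">I", buf, off)
    if n > MAX_BYTES_LEN:
        raise ValueError("len exceeds implementation limit")
    end = off + 4 + n
    if end > len(buf):
        raise ValueError("truncated value")
    return buf[off + 4:end], end


def encode_optional(value: Optional[bytes]) -> bytes:
    """u8 present flag (0=absent, 1=present) followed by the field bytes."""
    return b"\x00" if value is None else b"\x01" + value


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """tlv := u8 type + u16 len + len bytes."""
    return struct.pack(">BH", tlv_type, len(value)) + value


def decode_tlvs(buf: bytes, known: Set[int]) -> Dict[int, bytes]:
    """Walk a TLV sequence; unknown types are skipped rather than treated as fatal."""
    out: Dict[int, bytes] = {}
    off = 0
    while off < len(buf):
        if len(buf) - off < 3:
            raise ValueError("truncated TLV header")
        tlv_type, n = struct.unpack_from(">BH", buf, off)
        off += 3
        if off + n > len(buf):
            raise ValueError("truncated TLV value")
        if tlv_type in known:
            out[tlv_type] = buf[off:off + n]
        off += n
    return out
```

Checking the length against both the configured limit and the remaining buffer before slicing keeps the parser's work bounded by the input it has actually received.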
5.3 Common message header
All fabricBIOS messages share a common header.
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |    Version    |   Msg Type    |             Flags             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Payload Length                         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    |                        Request ID (64)                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    |                          Nonce (64)                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      [Frag Offset (32)]                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                     [Frag Total Len (32)]                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      Payload (variable)                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    |                   Signature (Ed25519, 64B)                    |
    |                       (if SIGNED flag)                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Field definitions:
| Field | Size | Description |
|---|---|---|
| Version | 8 bits | Protocol version. This specification defines 0x01. |
| Msg Type | 8 bits | Message type (§5.4). |
| Flags | 16 bits | Flags (§5.5). |
| Payload Length | 32 bits | Payload bytes only (excludes signature). |
| Request ID | 64 bits | Correlation ID; responses echo this. |
| Nonce | 64 bits | Anti-replay; semantics depend on flags (§5.8). |
| Frag Offset | 32 bits | Present only if FRAG_V2 is set: byte offset within full payload. |
| Frag Total Len | 32 bits | Present only if FRAG_V2 is set: total unfragmented payload size. |
| Payload | var | Type-specific payload. |
| Signature | 64B | Ed25519 signature if SIGNED is set. |
Fail-closed requirements:
- Receivers MUST reject unknown protocol versions.
- Receivers MUST reject messages with any unknown/reserved flag bit set.
- Receivers MUST reject messages whose declared lengths exceed configured bounds.
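A sketch of header parsing with the fail-closed checks applied before any payload work. Assumptions that are not stated in the text: flag bit 0 is taken to be the least significant bit of the 16-bit Flags field, MAX_PAYLOAD is a hypothetical configured bound, and the names Header and parse_header are invented for illustration.

```python
import struct
from dataclasses import dataclass
from typing import Optional

# Flag bits from §5.5; bit 0 = least significant bit is an assumption.
SIGNED, COMPRESSED, CONTINUED, FINAL, NONCE_IS_TIMESTAMP, FRAG_V2 = (1 << i for i in range(6))
KNOWN_FLAGS = 0x003F           # bits 6..15 are reserved and must be zero
MAX_PAYLOAD = 1 << 20          # hypothetical configured bound
SIG_LEN = 64

BASE = struct.Struct(">BBHIQQ")  # version, msg_type, flags, payload_len, request_id, nonce
FRAG = struct.Struct(">II")      # frag_offset, frag_total_len (only when FRAG_V2 is set)


@dataclass
class Header:
    version: int
    msg_type: int
    flags: int
    payload_len: int
    request_id: int
    nonce: int
    frag_offset: Optional[int] = None
    frag_total_len: Optional[int] = None
    size: int = BASE.size  # header bytes consumed, including fragment fields


def parse_header(frame: bytes) -> Header:
    """Parse and sanity-check a frame header; raise ValueError to drop the frame."""
    if len(frame) < BASE.size:
        raise ValueError("short frame")
    hdr = Header(*BASE.unpack_from(frame, 0))
    if hdr.version != 0x01:
        raise ValueError("unknown protocol version")
    if hdr.flags & ~KNOWN_FLAGS:
        raise ValueError("unknown/reserved flag bit set")
    if hdr.payload_len > MAX_PAYLOAD:
        raise ValueError("payload length exceeds configured bound")
    if hdr.flags & FRAG_V2:
        if len(frame) < BASE.size + FRAG.size:
            raise ValueError("short fragment header")
        hdr.frag_offset, hdr.frag_total_len = FRAG.unpack_from(frame, BASE.size)
        hdr.size += FRAG.size
    expected = hdr.size + hdr.payload_len + (SIG_LEN if hdr.flags & SIGNED else 0)
    if len(frame) != expected:
        raise ValueError("frame length does not match declared lengths")
    return hdr
```

All checks here are constant-time comparisons on fixed-offset fields, so they can run before signature verification without giving an unauthenticated sender a way to trigger expensive work.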
5.4 Message types
| Code | Name | Direction | Signed |
|---|---|---|---|
| 0x01 | ANNOUNCE | Node → Fabric | REQUIRED |
| 0x02 | SOLICIT | Client → Fabric/Relay | OPTIONAL (RECOMMENDED outside a single trusted L2) |
| 0x03 | WITHDRAW | Node → Fabric | REQUIRED |
| 0x10 | REQUEST | Client → Node | REQUIRED for any operation beyond discovery |
| 0x11 | RESPONSE | Node/Relay → Client | REQUIRED |
| 0x20 | REVOKE_BROADCAST | Node → Fabric | REQUIRED |
5.5 Flags
| Bit | Name | Meaning |
|---|---|---|
| 0 | SIGNED | Signature trailer present |
| 1 | COMPRESSED | Payload is zstd-compressed |
| 2 | CONTINUED | Fragmented message: more fragments follow |
| 3 | FINAL | Fragmented message: last fragment |
| 4 | NONCE_IS_TIMESTAMP | Nonce is UNIX seconds (u64). If unset, Nonce is random u64 |
| 5 | FRAG_V2 | Fragment metadata present (offset + total length) |
| 6–15 | Reserved | MUST be zero (receivers MUST reject if non-zero) |
5.6 Signature and compression rules
- If COMPRESSED is set, payload bytes are compressed before signing.
- Receivers MUST verify the signature before attempting decompression or deep parsing.
- The signature is computed over the exact on-wire bytes of the header fields (including any fragment metadata fields when present) plus payload, excluding the signature bytes.
- If fragmented, each fragment is signed independently over that fragment’s header+payload.
5.7 Fragmentation and reassembly
Fragmentation is indicated by CONTINUED and FINAL and requires FRAG_V2 metadata.
Sending rules:
- If a message is fragmented, FRAG_V2 MUST be set on every fragment.
- Frag Offset and Frag Total Len MUST be consistent across all fragments for the same reassembly key.
- FINAL MUST be set on the fragment where frag_offset + payload_len == total_len.
- CONTINUED MUST be set on every fragment except the final fragment.
Reassembly rules:
- Receivers MUST reassemble fragments keyed by (sender_identity, request_id).
- All fragments for a key MUST agree on: version, msg_type, nonce, and frag_total_len. Any mismatch MUST cause the in-progress reassembly to be dropped.
- Receivers MUST reject overlapping fragments.
- Receivers MUST bound in-flight reassembly by count and memory, and drop incomplete entries after a short timeout.
Required bounds (defaults):
- MAX_INFLIGHT_REASSEMBLIES (default 1024)
- MAX_REASSEMBLY_BYTES per entry (default 1 MiB)
- REASSEMBLY_TIMEOUT_SEC (default 2–5 seconds for UDP discovery workloads)
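The reassembly rules can be illustrated with a small bounded reassembler. This is a sketch under assumptions: the class and method names are hypothetical, the 5-second timeout is one value from the default range, empty fragments are rejected as a simplifying choice, and signature verification of each fragment is assumed to have happened before add() is called.

```python
import time
from dataclasses import dataclass, field

MAX_INFLIGHT_REASSEMBLIES = 1024
MAX_REASSEMBLY_BYTES = 1 << 20      # 1 MiB per entry
REASSEMBLY_TIMEOUT_SEC = 5.0        # one value from the 2-5 s default range


@dataclass
class _Entry:
    invariant: tuple                 # (version, msg_type, nonce, frag_total_len)
    started: float
    chunks: dict = field(default_factory=dict)  # frag_offset -> payload bytes


class Reassembler:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._inflight = {}          # (sender_identity, request_id) -> _Entry

    def add(self, sender, request_id, version, msg_type, nonce,
            offset, total_len, payload):
        """Return the full payload once complete, else None; ValueError on violation."""
        now = self._clock()
        # Drop incomplete entries that have outlived the timeout.
        for key in [k for k, e in self._inflight.items()
                    if now - e.started > REASSEMBLY_TIMEOUT_SEC]:
            del self._inflight[key]
        end = offset + len(payload)
        if not payload or total_len > MAX_REASSEMBLY_BYTES or end > total_len:
            raise ValueError("fragment out of bounds")
        key = (sender, request_id)
        invariant = (version, msg_type, nonce, total_len)
        entry = self._inflight.get(key)
        if entry is None:
            if len(self._inflight) >= MAX_INFLIGHT_REASSEMBLIES:
                raise ValueError("too many in-flight reassemblies")
            entry = self._inflight[key] = _Entry(invariant, now)
        elif entry.invariant != invariant:
            del self._inflight[key]  # mismatch drops the in-progress reassembly
            raise ValueError("fragment mismatch; reassembly dropped")
        for o, chunk in entry.chunks.items():
            if offset < o + len(chunk) and o < end:
                raise ValueError("overlapping fragment")
        entry.chunks[offset] = payload
        if sum(len(c) for c in entry.chunks.values()) == total_len:
            del self._inflight[key]
            return b"".join(entry.chunks[o] for o in sorted(entry.chunks))
        return None
```

Because overlaps are rejected and every fragment lies within [0, total_len), the received byte count reaching total_len is sufficient to conclude that the payload is fully covered.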
5.8 Anti-replay and nonce handling
fabricBIOS supports two nonce modes:
Timestamp mode (NONCE_IS_TIMESTAMP=1):
- Nonce is UNIX seconds.
- Receiver MUST reject messages outside an allowed skew window (default ±300s).
- Receiver SHOULD keep a replay cache keyed by (sender_identity, request_id, nonce) within the skew window.
Random mode (NONCE_IS_TIMESTAMP=0):
- Nonce is random u64.
- Receiver MUST maintain a bounded replay cache per sender for at least REPLAY_WINDOW_SEC (default 300s) or MAX_NONCES entries, evicting oldest.
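Both nonce modes can share one bounded per-sender cache. The sketch below is illustrative only: ReplayGuard and its method are hypothetical names, the MAX_NONCES value is an assumption (the text names the constant without a default), and entries are retained for twice the skew window so that a timestamp accepted at the leading edge of the window cannot be replayed at the trailing edge.

```python
import time
from collections import OrderedDict

SKEW_SEC = 300
REPLAY_WINDOW_SEC = 300
MAX_NONCES = 4096                                   # assumed value, not from the text
RETAIN_SEC = max(REPLAY_WINDOW_SEC, 2 * SKEW_SEC)   # covers both edges of the skew window


class ReplayGuard:
    def __init__(self, clock=time.time):
        self._clock = clock
        self._seen = {}  # sender_identity -> OrderedDict[key, first_seen_time]

    def check(self, sender, request_id, nonce, nonce_is_timestamp) -> bool:
        """Return True if the message is fresh, False if it must be rejected."""
        now = self._clock()
        if nonce_is_timestamp and abs(now - nonce) > SKEW_SEC:
            return False
        cache = self._seen.setdefault(sender, OrderedDict())
        # Entries are inserted in time order, so the oldest is always first.
        while cache and now - next(iter(cache.values())) > RETAIN_SEC:
            cache.popitem(last=False)
        key = ("ts", request_id, nonce) if nonce_is_timestamp else ("rand", nonce)
        if key in cache:
            return False
        cache[key] = now
        if len(cache) > MAX_NONCES:
            cache.popitem(last=False)  # evict oldest to stay bounded
        return True
```

A production implementation would also bound the number of senders tracked; that is omitted here for brevity.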
5.9 Mandatory anti-DoS requirements
Implementations MUST:
- Rate-limit unsigned messages (default ≤10/sec per source IP).
- Perform cheap sanity checks (version, flags, length bounds) before any expensive work.
- Verify signature before decompression or deep parsing.
- Drop malformed headers without payload processing.
- Enforce maximum payload length limits for each message class.
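One common way to meet the unsigned-message rate limit is a token bucket per source IP. This sketch assumes a burst equal to the per-second rate and a hypothetical cap on tracked sources; neither value comes from the text, and the class name is invented.

```python
import time


class UnsignedRateLimiter:
    """Token bucket per source IP for unsigned messages (default 10 per second)."""

    def __init__(self, rate_per_sec=10.0, burst=10.0, max_sources=4096,
                 clock=time.monotonic):
        self._rate = rate_per_sec
        self._burst = burst              # assumed: burst equals one second of budget
        self._max_sources = max_sources  # assumed cap so the table cannot grow unbounded
        self._clock = clock
        self._buckets = {}               # source_ip -> (tokens, last_refill_time)

    def allow(self, source_ip: str) -> bool:
        now = self._clock()
        state = self._buckets.get(source_ip)
        if state is None:
            if len(self._buckets) >= self._max_sources:
                return False             # table full: fail closed
            tokens, last = self._burst, now
        else:
            tokens, last = state
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self._burst, tokens + (now - last) * self._rate)
        allowed = tokens >= 1.0
        self._buckets[source_ip] = (tokens - 1.0 if allowed else tokens, now)
        return allowed
```

The limiter would run before signature verification or decompression, alongside the header sanity checks, so that a flood of unsigned frames costs only a dictionary lookup each.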
6. Discovery
6.1 Transport
Discovery uses UDP port 5700.
6.2 Multicast groups
Well-known IPv6 multicast group IDs:
- Link-local: ff02::6662:696f:0001
- Site-local: ff05::6662:696f:0001
- Organization-local: ff08::6662:696f:0001
6.3 ANNOUNCE
ANNOUNCE is sent on boot, periodically (default 30s), and on material resource change. ANNOUNCE MUST be signed.
ANNOUNCE payload:
    node_id : u128
    node_addr : 16B IPv6
    fabric_id : u64
    sequence : u64 (monotonic per node)
    locality : LocalityInfo
    attestation : AttestationTLV (optional)
    resources : ResourceSummary[]
    features : u32 (optional; presence-gated)
LocalityInfo:
    rack_id : u32
    row_id : u32
    site_id : u32
    geo_hash : u64 (optional; presence-gated)
    custom : 32B operator-defined
Feature advertisement (optional):
If present, features is a bitmask advertising protocol feature support (e.g., compression, fragmentation requirements). Unknown bits are ignored by receivers.
6.4 WITHDRAW
WITHDRAW announces node departure or permanent unavailability. It MUST be signed.
WITHDRAW payload:
    node_id : u128
    sequence : u64
    reason : u16
6.5 SOLICIT
SOLICIT queries for resources. Signing SOLICIT is RECOMMENDED outside a single trusted L2 domain.
SOLICIT payload:
    query_type : u8 (0=all, 1=by_type, 2=by_node, 3=by_locality)
    filters : Filter[]
Filter:
    field : u8
    op : u8 (EQ, GT, LT, CONTAINS)
    value : 32B
6.5.1 Filter field registry
Filter field IDs are u8 with ranges:
- 0x00 reserved
- 0x01..=0x3F core registry
- 0x40..=0x7F experimental/extension
- 0x80..=0xFF vendor-specific
Core fields:
| Field | Name | Value encoding | Ops |
|---|---|---|---|
| 0x01 | RESOURCE_TYPE | value[0..2] = u16 BE | EQ |
| 0x02 | NODE_ID | value[0..16] = u128 BE | EQ |
| 0x03 | SITE_ID | value[0..4] = u32 BE | EQ, GT, LT |
| 0x04 | ROW_ID | value[0..4] = u32 BE | EQ, GT, LT |
| 0x05 | RACK_ID | value[0..4] = u32 BE | EQ, GT, LT |
| 0x06 | LOCALITY_CUSTOM | value[0..32] opaque | EQ, CONTAINS |
| 0x07 | RESOURCE_FLAGS | value[0..2] = u16 BE bitmask | EQ, CONTAINS |
CONTAINS for LOCALITY_CUSTOM uses non-zero bytes as required matches. RESOURCE_FLAGS uses a bitmask; CONTAINS requires all bits set.
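The two CONTAINS rules can be sketched as follows. The numeric op codes are hypothetical (the text lists the op names but does not assign values), and the function names are invented for illustration.

```python
# Hypothetical op codes; the text names EQ, GT, LT, CONTAINS without numeric values.
EQ, GT, LT, CONTAINS = range(4)


def match_locality_custom(op: int, value: bytes, custom: bytes) -> bool:
    """LOCALITY_CUSTOM: value[0..32] opaque; CONTAINS matches on non-zero filter bytes."""
    if len(value) != 32 or len(custom) != 32:
        raise ValueError("LOCALITY_CUSTOM operands must be 32 bytes")
    if op == EQ:
        return value == custom
    if op == CONTAINS:
        # A zero byte in the filter is a wildcard; a non-zero byte must match exactly.
        return all(v == 0 or v == c for v, c in zip(value, custom))
    raise ValueError("op not allowed for LOCALITY_CUSTOM")


def match_resource_flags(op: int, value: bytes, flags: int) -> bool:
    """RESOURCE_FLAGS: value[0..2] is a u16 BE bitmask; CONTAINS needs all bits set."""
    if len(value) != 32:
        raise ValueError("filter value must be 32 bytes")
    mask = int.from_bytes(value[:2], "big")
    if op == EQ:
        return flags == mask
    if op == CONTAINS:
        return (flags & mask) == mask
    raise ValueError("op not allowed for RESOURCE_FLAGS")
```

For example, a filter whose first byte is 0xAA and whose remaining bytes are zero matches any node whose custom locality field begins with 0xAA.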
6.6 Discovery relay
In routed fabrics, multicast may be unavailable or unreliable; relays are commonly required.
- Nodes periodically unicast ANNOUNCE to the relay.
- Clients unicast SOLICIT to the relay.
- The relay responds with RESPONSE messages containing aggregated ANNOUNCE payloads (may be chunked).
6.6.1 Relay discovery profile (RESPONSE payload)
Relays encode inventory responses using MsgType::RESPONSE with this fixed payload layout:
    count : u16
    repeated count times:
        announce_payload_bytes : bytes (u32 length + bytes of ANNOUNCE payload encoding)
Rules:
- Each list entry MUST be a complete ANNOUNCE payload encoded per this specification.
- Relays MAY split inventory across multiple RESPONSE frames to honor MTU guidance (§6.7).
- Clients MUST treat the list as an unordered snapshot.
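A minimal encoder and decoder for this layout, assuming hypothetical bounds MAX_ENTRIES and MAX_ENTRY_BYTES and treating each ANNOUNCE payload as opaque bytes. Rejecting trailing bytes is a choice made in this sketch, not a rule stated in the text.

```python
import struct
from typing import List

MAX_ENTRIES = 4096            # hypothetical client-side bound
MAX_ENTRY_BYTES = 64 * 1024   # hypothetical per-entry bound


def encode_inventory(announces: List[bytes]) -> bytes:
    """count : u16, then count times (u32 length + ANNOUNCE payload bytes)."""
    parts = [struct.pack(">H", len(announces))]
    for payload in announces:
        parts.append(struct.pack(">I", len(payload)) + payload)
    return b"".join(parts)


def decode_inventory(payload: bytes) -> List[bytes]:
    """Return the ANNOUNCE payloads as an unordered snapshot; ValueError if malformed."""
    if len(payload) < 2:
        raise ValueError("truncated count")
    (count,) = struct.unpack_from(">H", payload, 0)
    if count > MAX_ENTRIES:
        raise ValueError("count exceeds bound")
    off = 2
    entries = []
    for _ in range(count):
        if len(payload) - off < 4:
            raise ValueError("truncated entry length")
        (n,) = struct.unpack_from(">I", payload, off)
        off += 4
        if n > MAX_ENTRY_BYTES or off + n > len(payload):
            raise ValueError("entry length out of bounds")
        entries.append(payload[off:off + n])
        off += n
    if off != len(payload):
        raise ValueError("trailing bytes after last entry")
    return entries
```

Each returned entry would then be parsed and verified as an ordinary ANNOUNCE payload.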
6.6.2 Relay verification requirements
- Relays MUST enforce nonce/replay protections.
- If configured with trust material, relays MUST reject unsigned or invalidly signed frames.
6.6.3 Relay limits and defaults
Relays MUST bound caches and rate-limit unauthenticated traffic. Recommended defaults:
- max_nodes_cached=4096
- max_inventory_bytes=1MiB
- replay_max_entries=4096
- unsigned_per_sender_per_sec=10
- sig_verifies_per_sec=2000
6.7 Chunking and MTU guidance
To avoid IP fragmentation, implementations SHOULD limit UDP payloads to ≤1200 bytes. For larger inventories/responses:
- Use fragmentation flags and FRAG_V2 metadata (§5.7).
- Receivers MUST apply bounded reassembly behavior (§5.7).
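The arithmetic behind the 1200-byte guidance: the base header is 24 bytes, fragment metadata adds 8, and a signature trailer adds 64, which leaves 1104 payload bytes per signed fragment. The sketch below plans fragments on that basis. It assumes flag bit 0 is the least significant bit, uses a hypothetical function name, and returns only the fragment-related flag bits (the caller would add SIGNED and any others).

```python
CONTINUED, FINAL, FRAG_V2 = 1 << 2, 1 << 3, 1 << 5  # assumes bit 0 = least significant bit
UDP_BUDGET = 1200        # SHOULD-level limit on UDP payload size
BASE_HEADER_LEN = 24     # version, type, flags, payload length, request ID, nonce
FRAG_FIELDS_LEN = 8      # frag offset + frag total len
SIG_LEN = 64


def plan_fragments(payload: bytes, signed: bool = True):
    """Return a list of (fragment_flags, frag_offset, chunk) tuples.

    A payload that fits in one frame is returned unfragmented with flags 0.
    """
    sig = SIG_LEN if signed else 0
    if len(payload) <= UDP_BUDGET - BASE_HEADER_LEN - sig:
        return [(0, 0, payload)]
    room = UDP_BUDGET - BASE_HEADER_LEN - FRAG_FIELDS_LEN - sig
    frags = []
    for off in range(0, len(payload), room):
        chunk = payload[off:off + room]
        last = off + len(chunk) == len(payload)
        # FINAL on the fragment where frag_offset + payload_len == total_len.
        frags.append((FRAG_V2 | (FINAL if last else CONTINUED), off, chunk))
    return frags
```

A 3000-byte signed payload, for instance, splits into fragments of 1104, 1104, and 792 bytes at offsets 0, 1104, and 2208.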
7. Resource Model
7.1 ResourceSummary
    resource_id : u128
    type : u16
    flags : u16
    capacity : u64
    available : u64
    descriptors : DescriptorTLV[]
    endpoints : EndpointTLV[] (optional; may be omitted and fetched via GET_INVENTORY)
7.2 Resource types
| Code | Type | Capacity unit | Notes |
|---|---|---|---|
| 0x0001 | CPU | Core count | Advertised only; no remote exec defined here |
| 0x0002 | MEM | Bytes | RDMA-accessible memory pool |
| 0x0003 | GPU | Compute units | Vendor fabric / RDMA-based |
| 0x0004 | NVME | Bytes | Block storage via NVMe-oF |
| 0x0005 | FPGA | Slots | Vendor-defined binding |
| 0x0006 | PMEM | Bytes | Persistent memory pool |
| 0x0007 | CXL_MEM | Bytes | CXL memory pooling/switch binding (extension) |
| 0x00FF | VENDOR | Vendor-defined | Extension space |
7.3 Resource flags
Resource flags are a u16 bitmask. Core flags and meanings are defined in Appendix B (placeholder).
8. Trust, Identity, and Enrollment
8.1 Node identity
NodeIdentity:
    node_id : u128
    public_key : 32B (Ed25519)
    certificate : Certificate (optional until enrolled)
Certificate:
    subject : u128 (node_id)
    issuer : u128 (CA id)
    issued_at : u64
    expires_at : u64
    public_key : 32B
    extensions : ExtensionTLV[]
    signature : 64B (Ed25519 by issuer)
Canonical identifier string forms:
- Hex string: 0x + 32 lowercase hex digits (zero-padded).
- URN string: urn:fabricbios:node:0x<32-hex>.
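The two canonical string forms are straightforward to produce and parse. The helper names below are hypothetical; the parser is deliberately strict and accepts only the lowercase, zero-padded form.

```python
URN_PREFIX = "urn:fabricbios:node:"


def node_id_hex(node_id: int) -> str:
    """Hex form: 0x + 32 lowercase hex digits, zero-padded."""
    if not 0 <= node_id < 1 << 128:
        raise ValueError("node_id must fit in 128 bits")
    return f"0x{node_id:032x}"


def node_id_urn(node_id: int) -> str:
    """URN form: urn:fabricbios:node:0x<32-hex>."""
    return URN_PREFIX + node_id_hex(node_id)


def parse_node_urn(urn: str) -> int:
    """Strict parse of the URN form; rejects uppercase or unpadded digits."""
    prefix = URN_PREFIX + "0x"
    if not urn.startswith(prefix):
        raise ValueError("not a fabricBIOS node URN")
    digits = urn[len(prefix):]
    if len(digits) != 32 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError("expected 32 lowercase hex digits")
    return int(digits, 16)
```

Strict parsing keeps the string form canonical, so two spellings of the same node_id cannot appear as different identities in logs or certificate SAN comparisons.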
8.2 Trust anchors and controller discovery
Enrollment requires locating and trusting the fabric controller before a fabric certificate exists. Operators choose one:
A) Pinned Fabric CA (RECOMMENDED): node firmware contains fabric CA public key (or hash).
B) Pinned Controller Key (RECOMMENDED for small fabrics): node firmware contains controller public key (or hash).
C) TOFU (OPTIONAL, must be explicitly enabled): node pins controller key on first contact; subsequent enrollment requires same key.
8.3 Enrollment state machine
Nodes implement the following states:
- UNENROLLED: no fabric identity; may emit limited discovery with manufacturer identity only if allowed by policy.
- ENROLLING: establishing trust with controller; generating or presenting node keys; requesting certificate.
- ENROLLED: holds valid fabric certificate and uses it for all signed messages.
- LOCKED: refuses enrollment changes and key material changes except via explicit local operator action.
Required behavior:
- Nodes MUST NOT accept enrollment material from an unauthenticated controller per the configured trust anchor.
- Nodes MUST fail closed if controller identity is missing or ambiguous under the chosen trust mode.
- In LOCKED, nodes MUST reject remote attempts to reset, rotate, or re-enroll.
8.4 Enrollment protocol (minimum)
Enrollment is realized as authenticated control-plane operations over a secure transport (§9). At minimum, the controller MUST support:
- issuing a certificate binding node_id to public_key,
- renewing/replacing certificates,
- revoking certificates (operator-driven).
Exact opcodes and payloads are listed in Appendix B (placeholders included).
8.5 Attestation (optional)
Attestation is carried as a TLV:
    type : u8
    evidence : bytes
Attestation type codes are listed in Appendix B (placeholders included).
9. Secure Transports and Profiles
Lease management operations require an authenticated reliable transport. Default policy (2026-02-19): QUIC/TLS 1.3 is the standard control-plane transport for general-purpose nodes (including Pi-class bare metal and Linux nodes). Legacy profiles remain documented for migration only.
9.1 Profile FULL
- Discovery: UDP 5700
- Control + lease ops: QUIC on 5701 with TLS 1.3 (mutual authentication)
Requirements:
- TLS version MUST be 1.3.
- Peer identity MUST map to a fabric node/service identity (e.g., SAN URI form urn:fabricbios:node:... or equivalent mapping defined by the deployment).
- Implementations SHOULD support session resumption and 0-RTT only if anti-replay is preserved for idempotent operations.
9.2 Profile COMPAT (legacy migration only)
- Discovery: UDP 5700
- Control + lease ops: TCP 5701 with TLS 1.3 (mutual authentication)
- Status: retained only for migration/backward compatibility; SHOULD NOT be used for new deployments.
Requirements: same as FULL, except transport is TCP+TLS.
9.3 Profile PROXIED (constrained bring-up only)
- Firmware supports signed discovery and a minimal local interface to a trusted on-node proxy.
- Proxy terminates QUIC/TLS or TCP/TLS and performs lease ops and device programming.
- Status: implementation aid for constrained bring-up; SHOULD NOT be the steady-state profile for Pi/Linux nodes.
Requirements:
- The proxy MUST be within the trusted computing boundary and MUST enforce the same token, lease, and teardown rules as a native implementation.
- The proxy MUST preserve auditability (log identity, token IDs, lease IDs, and op results).
9.4 Normative requirement
ALLOC/BIND/RENEW/FREE and explicit lease revocation operations MUST use an authenticated reliable transport, satisfied by FULL, COMPAT, or PROXIED.
General-purpose nodes SHOULD implement FULL directly.
UDP is acceptable only for discovery and strictly limited, low-risk control.
10. Capability Tokens
10.1 Token structure
    version : u8
    token_id : u128
    resource_id : u128
    audience : u128 (node_id or service_id)
    permissions : u32
    issued_at : u64
    expires_at : u64 (default max TTL 300s)
    issuer : u128 (issuer identity)
    caveats : CaveatTLV[]
    signature : 64B (Ed25519 by issuer)
10.2 Audience binding (required)
A token is valid only when presented by its audience.
- Over QUIC/TLS or TCP/TLS: audience binding is satisfied by the authenticated peer identity.
- Over UDP: REQUEST MUST include presenter proof (§12.3).
audience = 0 indicates a bearer token and MUST be restricted to short TTL and SHOULD include SOURCE_IP caveats.
10.3 Permissions
| Bit | Name |
|---|---|
| 0 | READ |
| 1 | WRITE |
| 2 | ADMIN |
| 3 | DELEGATE |
| 4 | EXCLUSIVE |
Reserved bits MUST be zero.
10.4 Caveats
Caveat TLV:
    type : u8
    length : u16
    data : bytes
Caveat types and their exact numeric assignments are listed in Appendix B (placeholders included). The following caveat semantics are defined:
- TIME_BOUND: restricts validity window within token lifetime.
- SOURCE_IP: restricts source IP(s) allowed to present.
- RANGE: restricts byte ranges for MEM/NVME.
- RATE_LIMIT: safety limits a node MAY enforce.
- DEPTH: maximum delegation depth.
- AUDIENCE: additional audience constraint.
10.5 Delegation (attenuation)
Derived tokens can only restrict:
- add caveats,
- narrow permissions,
- set a new audience.
Verifier checks parent validity, DELEGATE permission, delegator identity, and restriction-only semantics.
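The restriction-only check can be sketched as a predicate over a parent and a derived token. Several points here are interpretations rather than rules from the text and are marked as assumptions: that the delegator is the parent's audience and appears as the child's issuer, that the child may not outlive the parent, that the child must target the same resource, and that permission bit 0 is the least significant bit. Signature verification is omitted, and caveats are modeled as a set of (type, data) pairs.

```python
from dataclasses import dataclass

# Permission bits from §10.3 (bit 0 = least significant bit is an assumption).
READ, WRITE, ADMIN, DELEGATE, EXCLUSIVE = (1 << i for i in range(5))


@dataclass(frozen=True)
class Token:
    resource_id: int
    audience: int
    permissions: int
    expires_at: int
    issuer: int
    caveats: frozenset  # set of (caveat_type, data) pairs


def is_valid_attenuation(parent: Token, child: Token, now: int) -> bool:
    """Return True if child only restricts parent. Signatures are not checked here."""
    return (
        parent.expires_at > now                               # parent still valid
        and (parent.permissions & DELEGATE) != 0              # parent may delegate
        and child.issuer == parent.audience                   # assumed: delegator is parent's holder
        and child.resource_id == parent.resource_id           # assumed: same resource
        and (child.permissions & ~parent.permissions) == 0    # permissions only narrowed
        and child.expires_at <= parent.expires_at             # assumed: lifetime not extended
        and parent.caveats <= child.caveats                   # caveats only added
    )
```

A verifier would apply this check at every link of a delegation chain, after verifying each token's signature.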
10.6 Token revocation
- Tokens SHOULD have short TTL (default max 300s).
- Immediate revocation can be broadcast:
REVOKE_BROADCAST payload:
    issuer : u128
    token_ids : u128[]
    until : u64
Receivers cache revocations until the time given in the until field and reject revoked token_ids.
Relationship to explicit lease revocation: REVOKE_BROADCAST distributes token revocations best-effort; it is not a guaranteed immediate teardown mechanism for a specific lease. For deterministic “recall now” behavior, use explicit lease revocation operations (§11.5).
11. Lease Management
11.1 Lease model
Any data-plane binding creates a lease. A token authorizes control operations; a lease governs data-plane lifetime.
Lease:
    lease_id : u128
    resource_id : u128
    holder : u128 (node_id/service_id)
    granted_at : u64
    expires_at : u64
    binding : DataPlaneBinding
11.2 Timing defaults
- duration: 60s (range 10s–3600s)
- renewal window: last 20%
- grace: 10s (range 0–60s)
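As a worked example of the defaults: a 60-second lease granted at t=1000 expires at t=1060, and its renewal window (the last 20%, or 12 seconds) opens at t=1048. The sketch below computes these values. The function name is hypothetical, and treating grace as extra time after expires_at before teardown is forced is an assumption; the text gives a grace value but does not define its semantics.

```python
def lease_timeline(granted_at: int, duration: int = 60, grace: int = 10) -> dict:
    """Compute key instants (UNIX seconds) for a lease using the §11.2 defaults."""
    if not 10 <= duration <= 3600:
        raise ValueError("duration out of range (10s-3600s)")
    if not 0 <= grace <= 60:
        raise ValueError("grace out of range (0-60s)")
    expires_at = granted_at + duration
    renew_from = expires_at - duration // 5   # renewal window: last 20% of the lease
    return {
        "renew_from": renew_from,
        "expires_at": expires_at,
        # Assumption: grace extends the teardown deadline past expiry.
        "teardown_by": expires_at + grace,
    }
```

With the defaults, lease_timeline(1000) yields renew_from=1048, expires_at=1060, and teardown_by=1070.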
11.3 Teardown (required)
On expiry or revoke:
- fabricBIOS MUST tear down data-plane authorization such that subsequent data-plane operations fail.
Examples:
- RDMA: invalidate/rotate keys and/or destroy/poison QPs; deregister memory
- NVMe-oF: disconnect controller session; revoke auth material
- Accelerator fabrics: revoke endpoint/session credentials
- CXL: remove/disable mapping window or decoder rule (platform-specific)
11.4 Teardown failure and fenced state
If teardown fails or hardware enters an unsafe state, fabricBIOS MUST fence the resource:
- No new leases are granted for that resource.
- Existing leases remain invalid at the control plane.
- Resource is reported as FENCED/DEGRADED in discovery.
- Implementations SHOULD attempt remediation if supported.
11.5 Explicit lease revocation (emergency recall)
fabricBIOS supports early termination of an active lease prior to expiry. Two operations are defined:
- LEASE_REVOKE: initiate teardown and return once teardown has been scheduled/initiated.
- LEASE_REVOKE_SYNC: initiate teardown and block until teardown completes successfully, or the resource is fenced, or a caller-provided deadline is reached.
Normative guarantees:
- After LEASE_REVOKE_SYNC returns OK, the node MUST guarantee the associated data-plane authorization has been revoked such that subsequent data-plane operations fail.
- If teardown fails, the node MUST fence the resource and return RESOURCE_FENCED.
- If teardown cannot be confirmed within the caller’s deadline, the node MUST either:
  - return TEARDOWN_TIMEOUT and guarantee teardown remains in progress with a bounded watchdog that will eventually transition the resource to either “torn down” or FENCED, or
  - fence immediately and return RESOURCE_FENCED.
Transport restriction:
- LEASE_REVOKE and LEASE_REVOKE_SYNC MUST be supported only over authenticated reliable transports (§9). They MUST NOT be accepted over UDP.
Auditability:
- Implementations MUST emit an audit record for explicit lease revocation including: actor identity, lease_id, resource_id (if known), outcome, and time-to-teardown.
12. Control Plane Operations
12.1 Transports and ports
- UDP 5700: discovery + strictly limited control
- QUIC 5701: control + lease management (required default)
- TCP 5701: legacy migration path only (COMPAT profile)
Operations that create/renew/free/revoke leases MUST use the authenticated reliable transport profiles.
12.2 REQUEST / RESPONSE payloads
Normative wire encoding for REQUEST/RESPONSE is defined in docs/spec/fabricbios-wire-encoding-v0.md. The summary below is informative.
REQUEST payload:
    op : u16
    resource_id : u128
    token : CapabilityToken
    params : bytes
    presenter_id : u128 (REQUIRED on UDP; OPTIONAL on QUIC, where peer identity is TLS-authenticated)
    presenter_sig : 64B (REQUIRED on UDP; OPTIONAL on QUIC, where peer identity is TLS-authenticated)
RESPONSE payload:
    status : u16
    op : u16 (echo)
    result : bytes
Notes:
- For operations that act on a lease by lease_id (e.g., LEASE_REVOKE_SYNC), resource_id SHOULD be set to zero and MUST be ignored by the receiver.
12.3 Presenter proof on UDP (required for UDP requests)
On UDP, presenter_sig proves possession of presenter_id private key and binds the request to the token.
Canonical signing procedure (normative):
- Serialize the REQUEST payload exactly as it appears on wire.
- Set the 64 bytes of presenter_sig in that serialized payload to all-zero bytes.
- Compute presenter_sig = Ed25519.Sign(presenter_sk, serialized_request_payload_with_zeroed_sig).
- Insert presenter_sig into the payload and transmit.
Verifiers MUST repeat the same zeroing procedure and verify the signature with presenter_id’s public key.
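The zero-then-sign procedure can be sketched independently of any particular Ed25519 library by passing the signing and verification primitives in as callables. The function names and the sig_offset parameter are hypothetical; the caller is assumed to know where presenter_sig sits in the serialized payload from the wire encoding.

```python
from typing import Callable

SIG_LEN = 64


def _zeroed(payload: bytes, sig_offset: int) -> bytes:
    """Copy of the serialized REQUEST payload with presenter_sig set to zero bytes."""
    if sig_offset < 0 or sig_offset + SIG_LEN > len(payload):
        raise ValueError("presenter_sig field out of bounds")
    return payload[:sig_offset] + bytes(SIG_LEN) + payload[sig_offset + SIG_LEN:]


def attach_presenter_sig(payload: bytes, sig_offset: int,
                         sign: Callable[[bytes], bytes]) -> bytes:
    """Sign the zeroed payload and insert the signature at sig_offset.

    `sign` stands in for Ed25519.Sign(presenter_sk, message) and is supplied
    by the caller.
    """
    sig = sign(_zeroed(payload, sig_offset))
    if len(sig) != SIG_LEN:
        raise ValueError("signature must be 64 bytes")
    return payload[:sig_offset] + sig + payload[sig_offset + SIG_LEN:]


def verify_presenter_sig(payload: bytes, sig_offset: int,
                         verify: Callable[[bytes, bytes], bool]) -> bool:
    """Repeat the zeroing procedure and check the signature.

    `verify` stands in for Ed25519 verification under presenter_id's public key.
    """
    sig = payload[sig_offset:sig_offset + SIG_LEN]
    return verify(_zeroed(payload, sig_offset), sig)
```

Because the signed message covers every other byte of the payload, including the token, changing any field after signing causes verification to fail.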
12.4 Status codes
| Code | Name |
|---|---|
| 0x0000 | OK |
| 0x0001 | INVALID_TOKEN |
| 0x0002 | INSUFFICIENT_PERM |
| 0x0003 | RESOURCE_NOT_FOUND |
| 0x0004 | RESOURCE_BUSY |
| 0x0005 | CAPACITY_EXCEEDED |
| 0x0006 | LEASE_EXPIRED |
| 0x0007 | RATE_LIMITED |
| 0x0008 | RESOURCE_FENCED |
| 0x0009 | TEARDOWN_TIMEOUT |
| 0x000A | LEASE_NOT_FOUND |
| 0x00FF | INTERNAL_ERROR |
TEARDOWN_TIMEOUT is only valid for LEASE_REVOKE_SYNC and MUST satisfy the guarantee in §11.5.
12.5 Operation model
Operations are identified by a u16 opcode. Exact opcodes and parameter/result encodings are listed in Appendix B (placeholders included).
At minimum, conformant nodes MUST support:
- PING
- GET_INVENTORY
- token mint/refresh/revoke
- resource-specific ALLOC/BIND/RENEW/FREE for supported resource types
- LEASE_REVOKE and LEASE_REVOKE_SYNC
12.6 Explicit lease revocation operation payloads (normative encodings)
The following payload formats are normative. Opcode numeric values are assigned in Appendix B (placeholder).
12.6.1 LEASE_REVOKE params
    lease_id : u128
    reason : u16
    flags : u16
Flags (bitmask):
- bit 0: RETURN_BINDING_INFO (if set, include binding info fields in result when available)
- bit 1: CANCEL_RENEWALS (if set, node MUST reject further renewals for this lease_id)
- bits 2..15: reserved, MUST be zero
12.6.2 LEASE_REVOKE result
    outcome : u8
    resource_id_present : u8
    resource_id : u128 (if present)
    binding_info_present : u8
    binding_kind : u16 (if present)
    binding_id : u128 (if present)
    reserved0 : u8 (must be 0; for alignment)
Outcome codes:
- 0: REVOKED (teardown initiated; may still be in progress)
- 1: ALREADY_EXPIRED (lease already expired; treated as success)
- 2: NOT_FOUND (lease unknown; receiver MAY treat as success for idempotency)
- 3: FENCED (resource fenced due to teardown failure)
12.6.3 LEASE_REVOKE_SYNC params
    lease_id : u128
    reason : u16
    flags : u16
    deadline_ms : u32
- deadline_ms MUST be non-zero.
- If deadline_ms exceeds implementation maximum, receiver MUST clamp to its maximum and proceed.
Flags are the same as LEASE_REVOKE.
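The LEASE_REVOKE_SYNC params are a fixed 24-byte big-endian structure, which makes them easy to sketch. Assumptions: flag bit 0 is the least significant bit, the 30-second maximum deadline is a hypothetical implementation limit, and the function names are invented for illustration.

```python
import struct

RETURN_BINDING_INFO, CANCEL_RENEWALS = 1 << 0, 1 << 1   # assumes bit 0 = least significant bit
KNOWN_FLAGS = RETURN_BINDING_INFO | CANCEL_RENEWALS
PARAMS = struct.Struct(">16sHHI")   # lease_id, reason, flags, deadline_ms


def encode_revoke_sync(lease_id: int, reason: int, flags: int, deadline_ms: int) -> bytes:
    if flags & ~KNOWN_FLAGS:
        raise ValueError("reserved flag bits must be zero")
    if deadline_ms == 0:
        raise ValueError("deadline_ms must be non-zero")
    return PARAMS.pack(lease_id.to_bytes(16, "big"), reason, flags, deadline_ms)


def decode_revoke_sync(buf: bytes, max_deadline_ms: int = 30_000):
    """Return (lease_id, reason, flags, deadline_ms), clamping the deadline.

    max_deadline_ms is a hypothetical implementation maximum.
    """
    if len(buf) != PARAMS.size:
        raise ValueError("bad params length")
    raw_id, reason, flags, deadline_ms = PARAMS.unpack(buf)
    if flags & ~KNOWN_FLAGS:
        raise ValueError("reserved flag bits must be zero")
    if deadline_ms == 0:
        raise ValueError("deadline_ms must be non-zero")
    # Over-long deadlines are clamped rather than rejected.
    return int.from_bytes(raw_id, "big"), reason, flags, min(deadline_ms, max_deadline_ms)
```

Clamping on the receiver side means a caller that asks for a longer deadline than the node supports still gets a bounded wait instead of an error.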
12.6.4 LEASE_REVOKE_SYNC result
Same as LEASE_REVOKE result, except outcome code 0 (REVOKED) means teardown is complete and authorization is revoked.
If the operation cannot confirm teardown before the deadline, it MUST return TEARDOWN_TIMEOUT status and MAY include best-effort resource_id/binding info.
13. Data-Plane Bindings (examples)
fabricBIOS returns binding credentials and endpoint descriptors. It does not implement bulk transfer.
13.1 RDMA binding (example)
    transport : u8
    gid : 16B
    qp_type : u8
    qp_num : u32
    psn : u32
    mtu : u16
    rkey : u32
    remote_addr : u64
    length : u64
    vendor_data : bytes (opaque)
13.2 NVMe-oF binding (example)
    transport : u8
    address : 16B IPv6
    port : u16
    nqn : bytes
    controller_id : u16
    namespace_id : u32
    auth_key : bytes (optional)
13.3 Accelerator binding (vendor)
Vendor protocol endpoint + metadata blob.
13.4 CXL binding (extension)
Platform-specific mapping and authentication material; exact format TBD by platform standards.
14. Security Considerations
14.1 Threats and mitigations
- Spoofed discovery → signed frames and trust verification
- Token forgery → Ed25519 signatures
- Replay → nonce checks + bounded replay caches
- Stale data-plane access → lease expiry + teardown + explicit revoke sync
- MITM control plane → authenticated reliable transports for leases
- Decompression bombs → verify signature before decompression + bounds
- Memory abuse via fragmentation → bounded reassembly + overlap rejection
- Control-plane recall abuse → rate-limit LEASE_REVOKE_SYNC and require ADMIN authorization
14.2 Mandatory requirements summary
A conformant implementation MUST:
- Sign ANNOUNCE/WITHDRAW/REVOKE_BROADCAST and all control responses.
- Verify tokens, audience binding, and expiry on control operations.
- Verify signatures before decompression and deep parsing.
- Enforce nonce validity and replay cache behavior.
- Enforce lease expiry teardown; fence on teardown failure.
- Support explicit lease revocation including a synchronous form (LEASE_REVOKE_SYNC) that provides deterministic teardown-or-fence behavior.
- Reject unknown versions and unknown/reserved flag bits.
- Implement one secure transport profile (FULL is the normative default; COMPAT/PROXIED only where constrained).
15. Operational and Performance Considerations (informative)
- Bounded work on unauthenticated input is non-negotiable: length checks, rate limits, and early drops protect firmware-class devices.
- Transition costs exist in real systems: renegotiating leases, cache refill, key rotation, and re-binding can dominate short timescales.
- Measurement overhead (“observer effect”) can distort latency and throughput if telemetry is too intrusive; prefer low-overhead counters and sampling.
- Tuning knobs commonly required: replay cache sizes, signature verify budgets, reassembly limits, inventory chunking policies, and explicit revoke rate limits.
16. Interoperability and Compliance (informative but strongly recommended)
To avoid interop drift, deployments SHOULD maintain:
- Golden wire vectors for each message type (including fragmentation cases, compressed payloads, and invalid frames).
- Negative test corpus (bad flags, bad versions, oversized lengths, replayed nonces, overlap fragments).
- Fuzzing harness focusing on: header parsing, TLV parsing, reassembly, decompression, and token verification.
- Conformance checklist aligned to §14.2.
- Lease recall tests covering: LEASE_REVOKE_SYNC OK, LEASE_REVOKE_SYNC -> RESOURCE_FENCED, and LEASE_REVOKE_SYNC -> TEARDOWN_TIMEOUT.
Appendix A: Constants
Ports
- 5700/UDP: discovery + limited control
- 5701/QUIC: secure control (FULL profile, normative default)
- 5701/TCP: legacy migration only (COMPAT profile)
Multicast groups
- ff02::6662:696f:0001 (link-local)
- ff05::6662:696f:0001 (site-local)
- ff08::6662:696f:0001 (org-local)
Appendix B: Registries and Numeric Assignments (PLACEHOLDER — update before interoperability commitments)
This appendix is intentionally explicit about the remaining “registry work” and exact numeric assignments that must be finalized. The ranges and some assignments below are placeholders; finalize them and keep them stable.
B.1 TLV type ranges (u8)
Recommended ranges:
- 0x00 reserved
- 0x01..=0x3F core TLVs (this specification)
- 0x40..=0x7F experimental/extension
- 0x80..=0xFF vendor-specific
B.2 Caveat type registry (u8) — PLACEHOLDER
| Type code | Caveat name | Data encoding (normative once finalized) |
|---|---|---|
| 0x01 | TIME_BOUND | u64 not_before + u64 not_after |
| 0x02 | SOURCE_IP | u8 family + addr bytes + optional prefix |
| 0x03 | RANGE | u64 offset + u64 length |
| 0x04 | RATE_LIMIT | u32 units_per_sec + u32 burst |
| 0x05 | DEPTH | u8 max_depth |
| 0x06 | AUDIENCE | u128 required_audience |
| 0x07..0x3F | (reserved) | — |
B.3 DescriptorTLV registry (u8) — PLACEHOLDER
DescriptorTLV entries appear in ResourceSummary.descriptors.
| Type code | Descriptor name | Data encoding (placeholder) |
|---|---|---|
| 0x01 | NAME | UTF-8 string bytes |
| 0x02 | MODEL | UTF-8 string bytes |
| 0x03 | SERIAL | UTF-8 string bytes |
| 0x04 | FW_VERSION | UTF-8 string bytes |
| 0x05 | CAPABILITIES | u64 bitmask (resource-type specific) |
| 0x06..0x3F | (reserved) | — |
B.4 EndpointTLV registry (u8) — PLACEHOLDER
EndpointTLV entries appear in ResourceSummary.endpoints and/or may be returned via GET_INVENTORY.
| Type code | Endpoint name | Data encoding (placeholder) |
|---|---|---|
| 0x01 | ENDPOINT_RDMA | RDMA endpoint blob (typed fields or opaque bytes) |
| 0x02 | ENDPOINT_NVME | NVMe-oF endpoint blob |
| 0x03 | ENDPOINT_ACCEL | Vendor endpoint blob |
| 0x04 | ENDPOINT_CXL | CXL endpoint blob |
| 0x05 | ENDPOINT_OPAQUE | Opaque bytes for vendor/private use |
| 0x06..0x3F | (reserved) | — |
B.5 Certificate ExtensionTLV registry (u8) — PLACEHOLDER
| Type code | Extension name | Data encoding (placeholder) |
|---|---|---|
| 0x01 | ROLE | u32 role bitmask |
| 0x02 | ALLOWED_PREFIXES | list of IPv6 prefixes |
| 0x03 | HW_ATTEST_POLICY | opaque bytes |
| 0x04..0x3F | (reserved) | — |
B.6 Attestation type registry (u8) — PLACEHOLDER
| Type code | Attestation type |
|---|---|
| 0x00 | NONE |
| 0x01 | TPM2 |
| 0x02 | SGX |
| 0x03 | SEV |
| 0x04 | TDX |
| 0x05..0xFF | (reserved) |
B.7 Resource flag bits (u16) — PLACEHOLDER
| Bit | Name | Meaning |
|---|---|---|
| 0 | FENCED | Resource is fenced; no new leases granted |
| 1 | DEGRADED | Resource is degraded; higher layers should avoid if possible |
| 2 | MAINT | Resource under maintenance |
| 3..15 | (reserved) | — |
B.8 Operation opcode registry (u16) — PLACEHOLDER
This table is a placeholder map. Finalize the opcodes and define parameter/result encodings per opcode.
| Opcode | Name | resource_id | Params (placeholder) | Result (placeholder) |
|---|---|---|---|---|
| 0x0001 | PING | 0 | empty | u64 uptime_sec |
| 0x0002 | GET_INVENTORY | 0 | optional filters | inventory bytes (chunkable) |
| 0x0100 | CAP_REQUEST | u128 | requested perms/ttl/caveats/audience | token bytes |
| 0x0101 | CAP_REFRESH | u128 | token_id | token bytes |
| 0x0102 | CAP_REVOKE | u128 | token_id | empty |
| 0x0200 | LEASE_ALLOC | u128 | size/constraints | lease + RDMA binding |
| 0x0201 | LEASE_FREE | u128 | lease_id | empty |
| 0x0202 | LEASE_RENEW | u128 | lease_id + ttl | updated lease |
| 0x0300 | NVME_BIND | u128 | size/constraints | lease + NVMe binding |
| 0x0301 | NVME_UNBIND | u128 | lease_id | empty |
| 0x0302 | NVME_RENEW | u128 | lease_id + ttl | updated lease |
| 0x0400 | LEASE_REVOKE | 0 | §12.6.1 | §12.6.2 |
| 0x0401 | LEASE_REVOKE_SYNC | 0 | §12.6.3 | §12.6.4 |
| 0x0500.. | (reserved) | — | — | — |
B.9 Enrollment opcodes (u16) — PLACEHOLDER
| Opcode | Name | Params (placeholder) | Result (placeholder) |
|---|---|---|---|
| 0x1000 | ENROLL_REQUEST | CSR-like blob | pending/nonce |
| 0x1001 | ENROLL_ISSUE | node_id + pubkey + policy | certificate bytes |
| 0x1002 | ENROLL_ROTATE | new pubkey/CSR | certificate bytes |
| 0x1003 | ENROLL_REVOKE | node_id | empty |
| 0x1004 | ENROLL_LOCK | mode | empty |
| 0x1005 | ENROLL_RESET | reason | empty |
Appendix C: Implementation Guidance (informative)
C.1 Size expectations (typical)
| Platform | Typical size |
|---|---|
| Reference daemon | 1–5 MB |
| DPU firmware | 200 KB–1 MB |
| Minimal embedded | 100–500 KB |
C.2 Deployment targets
Linux daemon (dev/test), DPU (primary), BMC (FULL preferred; PROXIED where constrained), embedded controllers (PROXIED).
C.3 Suggested bring-up milestones
- ANNOUNCE/SOLICIT + relay discovery profile
- Token mint/refresh/revoke with one caveat type
- One lease type (e.g., LEASE_ALLOC) with correct teardown and fencing
- LEASE_REVOKE_SYNC behavior: success, timeout, and fenced outcomes
- Renewal + expiry teardown under fault injection
- Additional bindings (e.g., NVME_BIND), then vendor accelerator bindings