Affinity Request Model

Status: design decision. Commits how typed affinity constraints are expressed on placement requests.

Builds on: docs/grafos/affinity-taxonomy.md (taxonomy + strength classes) and docs/spec/scheduler-isolation-policy.md (filter→score→adapt pipeline). Scheduler-side implementation lands as a separate wave.


1. Problem

The taxonomy doc defines 5 affinity categories and 3 strength classes, but the request model only has:

  • PlacementRequest::affinity_with: Option<NodeId> — soft, single-node
  • PlacementRequest::anti_affinity_with: Option<NodeId> — soft, single-node
  • TaskletAffinity — colocation strength (SameNode/SameRack/Any)
  • Service::anti_affinity_services: Vec<ServiceId>

There is no way for a caller to express “required resource affinity with GPU X” or “preferred data affinity with lease Y’s node” or “required anti-affinity from service Z’s failure domain.”

2. Decision: TLV-based affinity entries on the params blob

Follow the existing isolation/exclusivity precedent: carry affinity as optional TLV entries on the LeaseAllocRequest params blob. Each affinity entry is one TLV with a structured value encoding category, strength, and target.

2.1 TLV layout

tag = 0x0910 (u16 BE) — TLV_LEASE_AFFINITY
length = N (u16 BE)
value = affinity entry (variable length)

Multiple TLV_LEASE_AFFINITY entries may appear in the same params blob — one per affinity constraint. This matches how TLV streams work (scan-for-tag finds all entries, not just the first).
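The scan-for-tag behaviour can be sketched as below. This is a minimal illustration, not the parser this note defers: it assumes the params blob is a plain concatenation of tag/length/value triples with no outer framing, and `TlvError` and `scan_affinity_tlvs` are hypothetical names.

```rust
/// Tag for affinity entries, from section 2.1.
const TLV_LEASE_AFFINITY: u16 = 0x0910;

/// Errors for a malformed TLV stream (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum TlvError {
    /// Fewer than 4 bytes remain where a tag + length header was expected.
    TruncatedHeader,
    /// The declared length runs past the end of the blob.
    TruncatedValue,
}

/// Collects the value bytes of every TLV_LEASE_AFFINITY entry, in order.
/// Entries with other tags are skipped, not rejected.
fn scan_affinity_tlvs(blob: &[u8]) -> Result<Vec<&[u8]>, TlvError> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < blob.len() {
        if blob.len() - pos < 4 {
            return Err(TlvError::TruncatedHeader);
        }
        // Tag and length are both u16 big-endian.
        let tag = u16::from_be_bytes([blob[pos], blob[pos + 1]]);
        let len = u16::from_be_bytes([blob[pos + 2], blob[pos + 3]]) as usize;
        pos += 4;
        if blob.len() - pos < len {
            return Err(TlvError::TruncatedValue);
        }
        if tag == TLV_LEASE_AFFINITY {
            out.push(&blob[pos..pos + len]);
        }
        pos += len;
    }
    Ok(out)
}
```

Returning every match, rather than stopping at the first, is what lets one request carry several independent constraints.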

2.2 Affinity entry encoding

+--------+--------+--------+--------+
| category (u8) |
+--------+--------+--------+--------+
| strength (u8) |
+--------+--------+--------+--------+
| target_type (u8) |
+--------+--------+--------+--------+
| target_len (u16 BE) |
+--------+--------+--------+--------+
| target (target_len bytes) |
+--------+--------+--------+--------+

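Fields are packed with no padding: category at offset 0, strength at offset 1, target_type at offset 2, target_len at offsets 3-4, and target from offset 5. Each row of the diagram is one field, not a 32-bit word.

Total entry: 5 + target_len bytes.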
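As an illustration of the entry layout, the following sketch encodes and decodes one entry value. The names `RawAffinityEntry`, `encode_entry`, and `decode_entry` are hypothetical. The check that the whole entry fits the u16 TLV length from section 2.1 is an inference from that layout, not something this note states.

```rust
/// Fixed header: category (1) + strength (1) + target_type (1) + target_len (2).
const ENTRY_HEADER_LEN: usize = 5;

/// A decoded entry holding raw, not yet validated, field bytes.
#[derive(Debug, PartialEq)]
struct RawAffinityEntry<'a> {
    category: u8,
    strength: u8,
    target_type: u8,
    target: &'a [u8],
}

/// Serializes one entry value. Returns None if the entry (header plus target)
/// would not fit in the u16 length field of the enclosing TLV.
fn encode_entry(category: u8, strength: u8, target_type: u8, target: &[u8]) -> Option<Vec<u8>> {
    if ENTRY_HEADER_LEN + target.len() > u16::MAX as usize {
        return None;
    }
    let target_len = target.len() as u16;
    let mut buf = Vec::with_capacity(ENTRY_HEADER_LEN + target.len());
    buf.extend_from_slice(&[category, strength, target_type]);
    buf.extend_from_slice(&target_len.to_be_bytes());
    buf.extend_from_slice(target);
    Some(buf)
}

/// Parses one entry value. Returns None unless the value is exactly
/// 5 + target_len bytes long.
fn decode_entry(value: &[u8]) -> Option<RawAffinityEntry<'_>> {
    if value.len() < ENTRY_HEADER_LEN {
        return None;
    }
    let target_len = u16::from_be_bytes([value[3], value[4]]) as usize;
    if value.len() != ENTRY_HEADER_LEN + target_len {
        return None;
    }
    Some(RawAffinityEntry {
        category: value[0],
        strength: value[1],
        target_type: value[2],
        target: &value[ENTRY_HEADER_LEN..],
    })
}
```

For a 16-byte NodeId target this yields a 21-byte value whose header ends in `0x00 0x10`.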

2.3 Category encoding

| Value | Category | v1? | Notes |
|---|---|---|---|
| 0x01 | Resource | Yes | Co-locate with a specific resource_id |
| 0x02 | State | Yes | Co-locate with a lease/data shard |
| 0x03 | Topology | Yes | Anti-affinity from a failure domain |
| 0x04 | Trust | Yes | Require attestation domain match |
| 0x05 | Facility | No | Deferred (thermal/power/cooling) |
| 0x06..0xFE | reserved | | Fail closed |

2.4 Strength encoding

| Value | Strength | Scheduler stage |
|---|---|---|
| 0x01 | Required | Filter (hard constraint, fail-closed) |
| 0x02 | Preferred | Score (soft ranking boost) |
| 0x03 | Adaptive | Reserved for the future adapt stage |

2.5 Target type encoding

| Value | Target type | Target bytes | Used with categories |
|---|---|---|---|
| 0x01 | NodeId | 16 bytes (u128 BE) | Resource, State, Topology |
| 0x02 | ResourceId | 16 bytes (u128 BE) | Resource |
| 0x03 | LeaseId | 16 bytes (u128 BE) | State |
| 0x04 | ServiceId | 16 bytes (u128 BE) | Topology anti-affinity |
| 0x05 | TrustDomain | variable (UTF-8 string) | Trust |
| 0x06 | RackId | 4 bytes (u32 BE) | Topology |
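The target sizes above translate into a simple length check. The sketch below is illustrative only; `target_bytes_valid`, `read_u128_be`, and `read_u32_be` are hypothetical helpers, and rejecting an empty TrustDomain string is an assumption this note does not make.

```rust
/// Checks target bytes against the sizes listed for each target type.
/// Unknown target types return false (fail closed).
fn target_bytes_valid(target_type: u8, target: &[u8]) -> bool {
    match target_type {
        // NodeId, ResourceId, LeaseId, ServiceId: u128 big-endian.
        0x01..=0x04 => target.len() == 16,
        // TrustDomain: variable-length UTF-8.
        // Assumption of this sketch: the empty string is rejected.
        0x05 => !target.is_empty() && std::str::from_utf8(target).is_ok(),
        // RackId: u32 big-endian.
        0x06 => target.len() == 4,
        _ => false,
    }
}

/// Reads a 16-byte big-endian identifier such as a NodeId.
fn read_u128_be(target: &[u8]) -> Option<u128> {
    if target.len() != 16 {
        return None;
    }
    let mut bytes = [0u8; 16];
    bytes.copy_from_slice(target);
    Some(u128::from_be_bytes(bytes))
}

/// Reads a 4-byte big-endian RackId.
fn read_u32_be(target: &[u8]) -> Option<u32> {
    if target.len() != 4 {
        return None;
    }
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(target);
    Some(u32::from_be_bytes(bytes))
}
```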

2.6 Anti-affinity

Anti-affinity is not a separate category. It is expressed by combining the Topology category with the appropriate target and setting a direction flag in the strength byte. The taxonomy doc §5.3 defines anti-affinity as “prefer/require placement away from a specific node, rack, or service’s failure domain.”

The high bit of the strength byte carries the direction, distinguishing affinity-toward from affinity-away:

strength byte layout:
bits [0:6] = strength value (Required=0x01, Preferred=0x02, Adaptive=0x03)
bit [7] = anti-affinity flag (0 = toward, 1 = away)

So:

  • 0x01 = Required affinity (toward)
  • 0x81 = Required anti-affinity (away from)
  • 0x02 = Preferred affinity (toward)
  • 0x82 = Preferred anti-affinity (away from)

This keeps anti-affinity first-class per the taxonomy doc §13 principle without adding a separate encoding dimension.
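A minimal sketch of splitting and rebuilding the strength byte follows. The enum and function names are hypothetical. Adaptive decodes successfully here because it is a defined value; whether a v1 parser accepts it is a separate policy question.

```rust
/// Bit 7 of the strength byte: 0 = toward, 1 = away.
const ANTI_AFFINITY_BIT: u8 = 0x80;
/// Bits 0..6 of the strength byte hold the strength value.
const STRENGTH_MASK: u8 = 0x7F;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Strength {
    Required,
    Preferred,
    Adaptive,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Direction {
    Toward,
    Away,
}

/// Splits a strength byte into (strength, direction).
/// Returns None when the masked value is not a defined strength.
fn decode_strength(byte: u8) -> Option<(Strength, Direction)> {
    let direction = if byte & ANTI_AFFINITY_BIT != 0 {
        Direction::Away
    } else {
        Direction::Toward
    };
    let strength = match byte & STRENGTH_MASK {
        0x01 => Strength::Required,
        0x02 => Strength::Preferred,
        0x03 => Strength::Adaptive,
        _ => return None,
    };
    Some((strength, direction))
}

/// Builds the strength byte from its two parts.
fn encode_strength(strength: Strength, direction: Direction) -> u8 {
    let value = match strength {
        Strength::Required => 0x01,
        Strength::Preferred => 0x02,
        Strength::Adaptive => 0x03,
    };
    match direction {
        Direction::Toward => value,
        Direction::Away => value | ANTI_AFFINITY_BIT,
    }
}
```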

3. Fail-closed rules

  1. Unknown category byte → reject with LeaseError::InvalidIntent.
  2. Category not in v1 set (e.g. Facility = 0x05) → reject until that category is implemented in a future phase.
  3. Unknown strength byte (after masking the anti-affinity bit) → reject.
  4. Unknown target type for the given category → reject.
  5. Required affinity that cannot be satisfied → scheduler returns empty placement (no candidates pass the filter). Rejection reason: AffinityRejection::RequiredAffinityUnsatisfiable.
  6. Preferred affinity with no matching candidates → placement proceeds with reduced score; no rejection. The caller gets placement but may not get the preferred target.
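Rules 1 to 4 are static checks on the entry header and could be applied in order, as in this sketch. `Reject` and `validate_header` are hypothetical names; how these map onto LeaseError::InvalidIntent is left to the implementation wave. The sketch treats Adaptive (0x03) as a known strength and so passes it under rule 3; a v1 parser may choose to reject it as well.

```rust
/// Reasons a header fails the static fail-closed checks (hypothetical type).
#[derive(Debug, PartialEq)]
enum Reject {
    /// Rule 1: category byte is not defined.
    UnknownCategory(u8),
    /// Rule 2: category is defined but not in the v1 set.
    CategoryNotInV1(u8),
    /// Rule 3: strength value is not defined after masking bit 7.
    UnknownStrength(u8),
    /// Rule 4: target type is not listed for this category.
    TargetTypeNotAllowed { category: u8, target_type: u8 },
}

/// Applies rules 1 to 4 to the three header bytes of one entry.
fn validate_header(category: u8, strength: u8, target_type: u8) -> Result<(), Reject> {
    match category {
        0x01..=0x04 => {}
        0x05 => return Err(Reject::CategoryNotInV1(category)),
        _ => return Err(Reject::UnknownCategory(category)),
    }
    // Mask off the anti-affinity bit before checking the strength value.
    match strength & 0x7F {
        0x01..=0x03 => {}
        _ => return Err(Reject::UnknownStrength(strength)),
    }
    // Category/target pairs taken from the target type table in section 2.5.
    let allowed = matches!(
        (category, target_type),
        (0x01, 0x01 | 0x02)            // Resource: NodeId, ResourceId
            | (0x02, 0x01 | 0x03)      // State: NodeId, LeaseId
            | (0x03, 0x01 | 0x04 | 0x06) // Topology: NodeId, ServiceId, RackId
            | (0x04, 0x05)             // Trust: TrustDomain
    );
    if !allowed {
        return Err(Reject::TargetTypeNotAllowed { category, target_type });
    }
    Ok(())
}
```

Rules 5 and 6 depend on the candidate set and therefore belong to the scheduler stages, not to this static check.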

4. Interaction with existing fields

The existing PlacementRequest fields (affinity_with, anti_affinity_with, Strategy::Affinity/AntiAffinity) remain as shorthand for the most common case: preferred node-level affinity. They are equivalent to:

  • affinity_with: Some(NodeId(X)) ≡ one TLV entry with category=Resource, strength=Preferred, target=NodeId(X)
  • anti_affinity_with: Some(NodeId(X)) ≡ one TLV entry with category=Topology, strength=Preferred|Anti, target=NodeId(X)

The scheduler should normalize both forms into the same internal representation before filtering/scoring. This preserves backwards compatibility while enabling richer affinity for new callers.
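The equivalence can be sketched as a normalization step. The types below are simplified stand-ins: `LegacyAffinity` models only the two legacy fields of PlacementRequest, `Constraint` is a hypothetical internal representation, and `NodeId` is assumed to wrap a u128 as the target type table suggests.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct NodeId(u128);

#[derive(Debug, Clone, Copy, PartialEq)]
enum Category {
    Resource,
    Topology,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Strength {
    Required,
    Preferred,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Direction {
    Toward,
    Away,
}

/// Hypothetical internal form shared by TLV entries and legacy fields.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Constraint {
    category: Category,
    strength: Strength,
    direction: Direction,
    target: NodeId,
}

/// Stand-in for the two legacy fields on PlacementRequest.
struct LegacyAffinity {
    affinity_with: Option<NodeId>,
    anti_affinity_with: Option<NodeId>,
}

/// Maps the legacy shorthand onto the same constraints a TLV entry would produce.
fn normalize_legacy(req: &LegacyAffinity) -> Vec<Constraint> {
    let mut out = Vec::new();
    if let Some(node) = req.affinity_with {
        out.push(Constraint {
            category: Category::Resource,
            strength: Strength::Preferred,
            direction: Direction::Toward,
            target: node,
        });
    }
    if let Some(node) = req.anti_affinity_with {
        out.push(Constraint {
            category: Category::Topology,
            strength: Strength::Preferred,
            direction: Direction::Away,
            target: node,
        });
    }
    out
}
```

Constraints decoded from TLV entries would be appended to the same list, so the filter and scorer see one uniform input.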

TaskletAffinity (SameNode/SameRack/Any) on TaskletLeaseAllocRequest is orthogonal — it describes CPU+memory colocation strength for composite leases, not placement affinity. It stays as-is.

5. Minimal v1 surface

Per taxonomy §12, the v1 surface includes:

| Category | Strength | Target types | Example use case |
|---|---|---|---|
| Resource | Required/Preferred | NodeId, ResourceId | “Place near GPU X” |
| State | Required/Preferred | NodeId, LeaseId | “Place on same node as lease Y” |
| Topology | Required/Preferred + Anti | NodeId, RackId, ServiceId | “Not on same rack as service Z” |
| Trust | Required | TrustDomain | “Only on attested nodes in domain D” |

Deferred to later phases:

  • Facility category (thermal/power/cooling)
  • Adaptive strength (runtime telemetry-driven adjustment)
  • Multi-entry scoring weights (e.g. “preferred affinity A is 2x more important than preferred affinity B”)

6. Scheduler integration

The scheduler pipeline already has the right shape:

filter → score → adapt

The implementation wave adds:

  1. AffinityRequiredFilter — scans TLV entries with strength=Required, rejects candidates that don’t match. Returns AffinityRejection::RequiredAffinityUnsatisfiable when all candidates are eliminated.

  2. AffinityPreferredScorer — scans TLV entries with strength=Preferred, adds a weighted boost to candidates that match. Uses a new affinity weight dimension in ScoreWeights alongside existing fit/locality/pressure/etc.

  3. Adapt stage — reserved for Adaptive strength in a future wave. Initially a no-op.
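The filter and score stages could look roughly like the sketch below. It is deliberately simplified: `Candidate`, `Target`, and `Constraint` are hypothetical types covering only node and rack targets, and every satisfied preferred entry contributes the same weight because per-entry weights are deferred.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    node: u128,
    rack: u32,
}

enum Target {
    Node(u128),
    Rack(u32),
}

struct Constraint {
    /// true = Required (filter stage), false = Preferred (score stage).
    required: bool,
    /// true = anti-affinity (away from the target).
    away: bool,
    target: Target,
}

/// A candidate satisfies a constraint when it matches the target for "toward"
/// or does not match it for "away".
fn satisfies(c: &Candidate, k: &Constraint) -> bool {
    let hit = match k.target {
        Target::Node(n) => c.node == n,
        Target::Rack(r) => c.rack == r,
    };
    hit != k.away
}

/// Filter stage: keep only candidates that satisfy every Required constraint.
/// An empty result corresponds to RequiredAffinityUnsatisfiable.
fn filter_required(candidates: &[Candidate], constraints: &[Constraint]) -> Vec<Candidate> {
    candidates
        .iter()
        .filter(|c| constraints.iter().filter(|k| k.required).all(|k| satisfies(c, k)))
        .cloned()
        .collect()
}

/// Score stage: add `weight` once per satisfied Preferred constraint.
fn preferred_boost(c: &Candidate, constraints: &[Constraint], weight: f64) -> f64 {
    let hits = constraints
        .iter()
        .filter(|k| !k.required && satisfies(c, k))
        .count();
    weight * hits as f64
}
```

A candidate that misses every Preferred constraint still passes with a boost of zero, which matches fail-closed rule 6: placement proceeds with a reduced score and no rejection.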

7. SDK surface (future)

Once the wire format lands, the SDK should expose something like:

    use grafos_std::cpu::{CpuBuilder, CpuIsolationClass};
    use grafos_std::affinity::{Affinity, AffinityStrength, AffinityTarget};

    let lease = CpuBuilder::new()
        .single_core()
        .isolation(CpuIsolationClass::WholeCore)
        .affinity(Affinity::resource(AffinityStrength::Preferred, AffinityTarget::node(gpu_node_id)))
        .anti_affinity(Affinity::topology(AffinityStrength::Required, AffinityTarget::rack(rack_42)))
        .lease_secs(60)
        .acquire()?;

The SDK surface is out of scope for this design note and is deferred to a separate follow-on.

8. What this note does NOT commit to

  • Implementation. TLV parser, scheduler filter/scorer, and SDK knob are all separate follow-on waves.
  • Facility category (thermal/power/cooling) — deferred.
  • Adaptive strength semantics — reserved; needs telemetry infrastructure that doesn’t exist yet.
  • Multi-entry weight tuning — deferred to a scoring-refinement wave once the basic preferred affinity works.
  • Cross-entry conflict resolution. A pair such as “required affinity with node A” plus “required anti-affinity from node A” is contradictory and should be rejected immediately. The parser should detect and reject such pairs statically, but the detection rules are not specified in this note.

9. References

  • docs/grafos/affinity-taxonomy.md: canonical taxonomy this request model serves
  • docs/spec/scheduler-isolation-policy.md: shared filter→score→adapt pipeline
  • docs/spec/cpu-isolation-wire-format.md: TLV precedent on LeaseAllocRequest params blob
  • crates/grafos-scheduler/src/placement.rs: PlacementRequest struct with existing affinity_with/anti_affinity_with fields
  • crates/grafos-scheduler/src/isolation_filter.rs: filter-stage pattern that AffinityRequiredFilter will follow
  • crates/fabricbios-core/src/tasklet.rs:31-48: TaskletAffinity (orthogonal, colocation strength)