Recipe 57: Per-Project Fair-Share Policy

Situation

A tenant runs three projects under one tenant account:

  • Project A — customer-facing inference service.
  • Project B — research team’s experimental workloads.
  • Project C — fleet-wide observability sidecars.

The team needs the experimental work to never starve customer inference, but inference also shouldn’t consume the tenant’s whole capacity budget when nothing else is running. Observability must always get at least a baseline. A flat tenant-wide quota is too coarse — the same budget gets eaten by whichever project ran first.

In grafOS, a WeightedFairSharePolicy expresses this directly. Each project gets a relative weight (its share when everything competes), plus a min_share_secs floor (always at least this many capacity-seconds in a rolling window) and a max_share_secs ceiling (never exceed this, even when other projects are idle). The same policy type also supports PriorityClass entries so you can layer the priority surface (guaranteed, standard, scavenger) on top.
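
To make the three knobs concrete, here is a minimal sketch of how a weight, a floor, and a ceiling could combine into a per-window budget. The Share struct and the budget_secs function are hypothetical names invented for this illustration, and the "proportional split, then clamp" rule is an assumption about the semantics described above, not the grafOS scheduler's actual algorithm.

```rust
/// Hypothetical stand-in for one table row; not the grafos-core type.
#[derive(Debug, Clone, Copy)]
struct Share {
    weight: u64,
    min_share_secs: u64,
    max_share_secs: u64,
}

/// Assumed rule: split the window in proportion to weight among the scopes
/// that are currently competing, then clamp the result to [floor, ceiling].
/// `competing_weight` is the sum of weights of all scopes with pending work,
/// including `me`. Returns None when nothing competes or floor > ceiling.
fn budget_secs(window_secs: u64, me: Share, competing_weight: u64) -> Option<u64> {
    if competing_weight == 0 || me.min_share_secs > me.max_share_secs {
        return None;
    }
    // saturating_mul keeps very large inputs from overflowing and panicking.
    let proportional = window_secs.saturating_mul(me.weight) / competing_weight;
    Some(proportional.clamp(me.min_share_secs, me.max_share_secs))
}
```

Under these assumptions, a row with weight 70, floor 1000, and ceiling 3000 in a 3600-second window gets 2520 seconds when the competing weights total 100, and is capped at 3000 when it is the only scope with work. A row with weight 10 and floor 200 is lifted to 200 when the competing weights total 1000.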

What You Build

A typed builder + validator that:

  • Constructs a WeightedFairSharePolicy from ProjectShare and PriorityShare entries (typed so callers don’t accidentally bind a priority weight to a project ID);
  • Fails fast on every invalid policy shape — zero window, empty table, zero weight, inverted floor/ceiling, duplicate scope;
  • Surfaces a preview helper that shows the fractional share each scope holds against the total weight, so policy authors can sanity-check a table before committing it.

The compiled recipe lives in cookbook/recipe-57-per-project-fair-share.

Core grafOS API Path

use grafos_core::{
    EconomicsGeneration, FairShareScope, FairShareWeight, Priority,
    WeightedFairSharePolicy,
};

let policy = WeightedFairSharePolicy {
    window_secs: 3600,
    generation: EconomicsGeneration(1),
    entries: vec![
        FairShareWeight {
            scope: FairShareScope::Project { project_id: 0xa1 },
            weight: 70,
            min_share_secs: 1000,
            max_share_secs: 3000,
        },
        FairShareWeight {
            scope: FairShareScope::Project { project_id: 0xa2 },
            weight: 20,
            min_share_secs: 100,
            max_share_secs: 3000,
        },
        FairShareWeight {
            scope: FairShareScope::Project { project_id: 0xa3 },
            weight: 10,
            min_share_secs: 200,
            max_share_secs: 500,
        },
        FairShareWeight {
            scope: FairShareScope::PriorityClass {
                priority: Priority::Guaranteed,
            },
            weight: 60,
            min_share_secs: 0,
            max_share_secs: 3600,
        },
    ],
};
assert!(policy.is_valid());
assert_eq!(policy.total_weight(), 70 + 20 + 10 + 60);

Program

use cookbook_recipe_57_per_project_fair_share::{
    build_policy, project_shares, PriorityShare, ProjectShare,
};
use grafos_core::{EconomicsGeneration, Priority};

let policy = build_policy(
    3600,
    EconomicsGeneration(1),
    &[
        ProjectShare {
            project_id: 0xa1,
            weight: 70,
            min_share_secs: 1000,
            max_share_secs: 3000,
        },
        ProjectShare {
            project_id: 0xa2,
            weight: 30,
            min_share_secs: 100,
            max_share_secs: 3000,
        },
    ],
    &[PriorityShare {
        priority: Priority::Guaranteed,
        weight: 60,
        min_share_secs: 0,
        max_share_secs: 3600,
    }],
)?;
let shares = project_shares(&policy);
for s in &shares {
    println!("{:?}: weight={} fraction={:.3}", s.scope, s.weight, s.fraction);
}
# Ok::<(), cookbook_recipe_57_per_project_fair_share::PolicyError>(())

Design

The recipe layers three checks before a policy commits:

  1. window_secs > 0 — fast-fail before iterating entries. A zero window makes every floor/ceiling meaningless.
  2. At least one entry — empty policy is rejected so the scheduler doesn’t silently accept a no-op table.
  3. WeightedFairSharePolicy::is_valid() — the authoritative check at the grafos-core layer. This rejects:
    • Zero weight on any entry (every scope must have a positive share).
    • Inverted floor/ceiling (min_share_secs > max_share_secs).
    • Duplicate scope: two entries with the same FairShareScope (same project_id, same priority, or same tenant_id). The table is keyed on scope, not on entry order, so one duplicate would silently overwrite the other at the scheduler.
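
The ordering of these checks can be sketched without the real crates. Everything below is a stand-in written for illustration: Scope, Entry, entries_valid, and check_policy are hypothetical names, and entries_valid only approximates what the text above says WeightedFairSharePolicy::is_valid() rejects. The PolicyError variant names match the ones this recipe reports.

```rust
use std::collections::HashSet;

// Hypothetical stand-ins for the grafos-core types; the real definitions
// live in grafos-core and may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Scope {
    Tenant(u64),
    Project(u64),
    PriorityClass(u8),
}

#[derive(Debug, Clone, Copy)]
struct Entry {
    scope: Scope,
    weight: u32,
    min_share_secs: u64,
    max_share_secs: u64,
}

#[derive(Debug, PartialEq, Eq)]
enum PolicyError {
    ZeroWindow,
    Empty,
    Invalid,
}

/// Approximation of the authoritative check: positive weights,
/// floor <= ceiling, and no scope appearing twice.
fn entries_valid(entries: &[Entry]) -> bool {
    let mut seen = HashSet::new();
    entries.iter().all(|e| {
        // HashSet::insert returns false when the scope was already present.
        e.weight > 0 && e.min_share_secs <= e.max_share_secs && seen.insert(e.scope)
    })
}

/// Runs the three checks in the order the recipe describes.
fn check_policy(window_secs: u64, entries: &[Entry]) -> Result<(), PolicyError> {
    if window_secs == 0 {
        return Err(PolicyError::ZeroWindow); // 1. fast-fail before iterating
    }
    if entries.is_empty() {
        return Err(PolicyError::Empty); // 2. reject a no-op table
    }
    if !entries_valid(entries) {
        return Err(PolicyError::Invalid); // 3. weight, floor/ceiling, duplicates
    }
    Ok(())
}
```

Because the window check runs first, a table that is both empty and has a zero window reports ZeroWindow in this sketch, which matches the "fast-fail before iterating entries" ordering.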

FairShareScope has three variants: Tenant, Project, and PriorityClass. The recipe surfaces typed builders only for Project and PriorityClass because tenant-wide entries are usually managed at a layer above per-tenant policy (the scheduler operator’s domain). Callers that need a tenant-scope entry build a FairShareWeight directly with FairShareScope::Tenant { tenant_id }.

The fractional share preview (project_shares) shows what the scheduler would compute as relative-share input. It is NOT an admission decision: actual admission ordering also depends on live usage (FairShareUsage snapshots) measured against the min_share_secs floor and the max_share_secs ceiling. The preview is for policy authors validating their table before commit.
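
The arithmetic behind such a preview is a single division per entry. The sketch below is not the recipe's project_shares implementation; SharePreview and preview are hypothetical names, and scopes are reduced to plain labels so the block stands alone.

```rust
/// Hypothetical preview row; the recipe's real project_shares() return
/// type is not reproduced here.
#[derive(Debug, PartialEq)]
struct SharePreview {
    label: &'static str,
    weight: u32,
    fraction: f64,
}

/// Divides each weight by the table's total weight.
/// Returns an empty Vec when the total is zero to avoid dividing by zero.
fn preview(rows: &[(&'static str, u32)]) -> Vec<SharePreview> {
    // Sum in u64 so many large u32 weights cannot overflow.
    let total: u64 = rows.iter().map(|&(_, w)| u64::from(w)).sum();
    if total == 0 {
        return Vec::new();
    }
    rows.iter()
        .map(|&(label, weight)| SharePreview {
            label,
            weight,
            fraction: f64::from(weight) / total as f64,
        })
        .collect()
}
```

Fed the four weights from the Core grafOS API Path table (70, 20, 10, 60; total 160), this yields fractions 0.4375, 0.125, 0.0625, and 0.375. Note that the weight-70 project holds 0.4375 rather than 0.70, because the priority-class entry contributes to the same total.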

Failure Modes

  • Zero window: PolicyError::ZeroWindow.
  • Empty table: PolicyError::Empty.
  • Zero weight: PolicyError::Invalid (the underlying WeightedFairSharePolicy::is_valid() returns false).
  • Inverted floor/ceiling: PolicyError::Invalid.
  • Duplicate scope (same project_id appearing twice, same priority class appearing twice): PolicyError::Invalid. Note this is per-scope-variant — a project entry and a priority entry are independent scopes even if both might apply to the same workload.
  • Generation collision: out-of-scope for this recipe. The EconomicsGeneration field gates the scheduler’s freshness-check; producers must monotonically bump it on every committed table.

Tests

Run it with:

cargo test -p cookbook-recipe-57-per-project-fair-share

Eight tests cover the happy path (3 projects + 2 priority classes in one table), share-fraction summing to 1.0, duplicate-project rejection, duplicate-priority rejection, zero-window fast-fail, inverted floor/ceiling rejection, zero-weight rejection, and the empty-policy fast-fail.

Adaptation Notes

  • Floors vs ceilings: min_share_secs is a soft guarantee the scheduler honors when work is available in that scope. max_share_secs is a hard ceiling the scheduler will not exceed even when other scopes are idle. If you want a scope to have capacity held for it whenever it has work, however busy other scopes are (reserved capacity), use min_share_secs > 0; if you want a scope that can burst but never dominate (a noisy-neighbor cap), use a tight max_share_secs.
  • Generation discipline: every committed policy carries an EconomicsGeneration counter. Bump it monotonically on every commit so the scheduler can drop stale tables on the freshness-check path. Two tables with the same generation are ambiguous.
  • Cross-scope composition: Tenant, Project, and PriorityClass scopes coexist in the same table. The scheduler evaluates each scope independently — a request that belongs to project A AND priority class Standard counts against both shares. Plan the weights with that in mind.
  • Live usage vs preview: project_shares shows the static weight fractions. Actual scheduling ordering also depends on live FairShareUsage. If a project has consumed 100% of its floor in the current window, the scheduler discounts it relative to projects below their floor — even if its weight is higher.
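
The generation discipline note above can be illustrated with a small producer-side guard. This is a sketch under assumptions: Generation and Committed are hypothetical stand-ins (not EconomicsGeneration or any scheduler type), and the "strictly newer wins" rule is one reading of the freshness check, not a description of the scheduler's code.

```rust
/// Hypothetical stand-in for the grafos-core generation counter.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Generation(u64);

/// Hypothetical holder for the most recently accepted table generation.
struct Committed {
    current: Option<Generation>,
}

impl Committed {
    /// Accepts a table only if its generation is strictly newer than the
    /// last accepted one; equal or older generations are rejected, so two
    /// tables can never share a generation.
    fn try_commit(&mut self, incoming: Generation) -> bool {
        match self.current {
            Some(cur) if incoming <= cur => false,
            _ => {
                self.current = Some(incoming);
                true
            }
        }
    }
}
```

Starting from an empty state, committing generation 1 succeeds, committing generation 1 again is refused as ambiguous, and generation 2 replaces it. A late arrival carrying generation 1 is then dropped as stale.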

See also:

  • crates/grafos-core/src/policy_vocab.rs: FairShareScope, FairShareWeight, WeightedFairSharePolicy, Priority, EconomicsGeneration.
  • crates/grafos-scheduler/src/quota.rs: evaluate_fair_share_admission, rank_fair_share_scopes, FairShareAdmissionDecision, FairShareUsage.
  • docs/operations/scheduler-features.md § “Weighted fair-share policy”.
  • docs/operations/siem-vocabulary-cookbook.md: log filters keyed on fair_share_scope == "project".