Recipe 54: Lease-Scoped Data Clean Room
Situation
Two teams want to run a shared analytics query over sensitive data, but neither team should receive durable, ambient access to the dataset. Access should be explicit, time-bound, audience-bound, and revocable.
In grafOS, a dataset can live in fabric object storage and each collaborator receives a short capability grant scoped to the dataset resource. The clean-room service validates that grant before reading the dataset and records the query effect idempotently.
What You Build
A clean-room query service that:
- stores dataset bytes through the real ObjectStore API;
- issues a short capability grant through the grafOS authority boundary;
- validates audience, expiry, permission, and revocation state;
- runs a simple aggregate over the dataset;
- records query completion in ReplicatedIdempotencyStore.
The compiled recipe lives in
cookbook/recipe-54-lease-scoped-data-clean-room.
Core grafOS API Path
The clean-room facade is intentionally thin: data is stored through the
ObjectStore API, access is authorized with grafos-std capability grants,
and query effects are recorded in ReplicatedIdempotencyStore.
    use grafos_replicated::{
        FenceEpoch, IdempotencyKey, IdempotencyOutcome, LogicalResourceName, OperationHash,
        ReplicatedIdempotencyStore, ResourceKind, SchemaId,
    };
    use grafos_std::capability::{
        CapabilityAuthority, CapabilityPermission, CapabilityRequest, RuntimeCapabilityAuthority,
    };
    use grafos_store::{FabricUri, MemObjectStore, ObjectStore};
    use cookbook_recipe_54_lease_scoped_data_clean_room::{CleanRoomError, CleanRoomOutcome};

    // Store the dataset bytes in the object store.
    let mut store = MemObjectStore::new(64)?;
    let mut authority = RuntimeCapabilityAuthority::local_development();
    let uri: FabricUri = "fabric://partners/revenue/team-a.csv".parse()?;
    store.put(&uri, b"10,20\n30", None)?;

    // Issue a grant with a 300-second TTL, then validate it before any read.
    let grant = authority.issue(
        CapabilityRequest::new(
            dataset_resource_id,
            CapabilityPermission::LEASE_QUERY,
            audience,
        )
        .ttl_secs(300)
        .nonce(grant_nonce),
        now_secs,
    )?;
    authority.validate(
        &grant,
        now_secs,
        &audience,
        CapabilityPermission::LEASE_QUERY,
    )?;
    // The grant must name the dataset resource being queried.
    if grant.resource_id() != dataset_resource_id {
        return Err(CleanRoomError::WrongDatasetResource {
            expected: dataset_resource_id,
            actual: grant.resource_id(),
        });
    }

    // Reserve the query key; a completed reservation means this is a duplicate.
    let query_key = IdempotencyKey::new(format!("clean-room-query:{query_id}"));
    let mut payload = dataset_resource_id.to_be_bytes().to_vec();
    payload.extend_from_slice(query_id.as_bytes());
    payload.push(0);
    payload.extend_from_slice(uri.to_string().as_bytes());
    let query_fingerprint = OperationHash::from_canonical_parts(
        &LogicalResourceName::new("lease-scoped-data-clean-room"),
        ResourceKind::Workflow,
        "sum-query",
        &SchemaId::new("clean-room-query.v1"),
        &payload,
    );
    let reservation = effects.reserve(
        query_key.clone(),
        query_fingerprint,
        None,
        FenceEpoch(1),
    )?;
    if matches!(reservation.value.outcome, IdempotencyOutcome::Completed { .. }) {
        return Ok(CleanRoomOutcome::Duplicate { query_id: query_id.into() });
    }

    // Read the dataset and sum its comma- and newline-separated fields.
    let object = store.get(&uri)?.ok_or(CleanRoomError::MissingDataset(uri.clone()))?;
    let text = core::str::from_utf8(&object.data).map_err(|_| CleanRoomError::InvalidUtf8)?;
    let mut sum = 0i64;
    for field in text.split(|c| c == ',' || c == '\n') {
        let trimmed = field.trim();
        if !trimmed.is_empty() {
            sum += trimmed
                .parse::<i64>()
                .map_err(|_| CleanRoomError::InvalidNumber)?;
        }
    }

    // Record completion of the query effect.
    effects.complete(
        query_key,
        reservation.version,
        IdempotencyOutcome::Completed { effect: None },
        FenceEpoch(1),
    )?;
    # let _ = sum;
    # Ok::<(), cookbook_recipe_54_lease_scoped_data_clean_room::CleanRoomError>(())
Program
    use cookbook_recipe_54_lease_scoped_data_clean_room::{
        replicated_clean_room_effects, CleanRoom, CleanRoomOutcome, CleanRoomQuery, DatasetGrant,
    };
    use grafos_std::capability::RuntimeCapabilityAuthority;
    use grafos_store::{FabricUri, MemObjectStore};

    let store = MemObjectStore::new(64)?;
    let authority = RuntimeCapabilityAuthority::local_development();
    let mut clean_room = CleanRoom::new(store, authority, [0x55; 32]);
    let uri: FabricUri = "fabric://partners/revenue/team-a.csv".parse()?;
    clean_room.put_dataset(&uri, b"10,20\n30")?;

    let grant = clean_room.issue_grant(
        DatasetGrant {
            dataset_resource_id: 77,
            audience: [0x55; 32],
            ttl_secs: 300,
            nonce: 9,
        },
        100,
    )?;

    let mut effects = replicated_clean_room_effects()?;
    let outcome = clean_room.run_sum_query(
        &mut effects,
        77,
        &grant,
        CleanRoomQuery {
            query_id: "sum-revenue".into(),
            dataset_uri: uri,
            now_secs: 100,
        },
    )?;

    match outcome {
        CleanRoomOutcome::Completed(result) => assert_eq!(result.sum, 60),
        CleanRoomOutcome::Duplicate { .. } => unreachable!("first query should run"),
    }
    # Ok::<(), cookbook_recipe_54_lease_scoped_data_clean_room::CleanRoomError>(())
Design
The grant is not a bearer string with informal meaning. It is an opaque grafOS capability whose fields bind:
- the dataset resource id;
- the intended audience;
- the permission bit required to query;
- an expiry time enforced by the authority;
- a nonce that can be revoked.
The query path validates the grant with the authority before reading from the object store. A revoked grant fails closed with a typed capability error. The fabricBIOS token format is below this API boundary.
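The checks described above can be modelled without any grafOS crates. The following is a minimal sketch, not the grafOS implementation: `ToyAuthority`, `Grant`, `GrantError`, and the `LEASE_QUERY` constant are hypothetical stand-ins, the grant is a plain struct rather than an opaque authenticated token, and HMAC validation is omitted. Only the revocation-first ordering comes from this recipe; the order of the remaining checks and the treatment of the exact expiry second as expired are assumptions.

```rust
use std::collections::HashSet;

/// Hypothetical permission bit standing in for `CapabilityPermission::LEASE_QUERY`.
pub const LEASE_QUERY: u32 = 1 << 0;

/// Simplified, non-opaque model of a grant (assumption: real grants are
/// opaque and authenticated; this struct only illustrates the bound fields).
#[derive(Debug, Clone)]
pub struct Grant {
    pub resource_id: u64,
    pub audience: [u8; 32],
    pub permissions: u32,
    pub expires_at_secs: u64,
    pub nonce: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum GrantError {
    Revoked,
    Expired,
    WrongAudience,
    MissingPermission,
    WrongResource,
}

/// Toy authority that tracks revoked nonces in memory.
#[derive(Default)]
pub struct ToyAuthority {
    revoked: HashSet<u64>,
}

impl ToyAuthority {
    /// Issues a grant that expires `ttl_secs` after `now_secs`.
    pub fn issue(
        &self,
        resource_id: u64,
        audience: [u8; 32],
        ttl_secs: u64,
        nonce: u64,
        now_secs: u64,
    ) -> Grant {
        Grant {
            resource_id,
            audience,
            permissions: LEASE_QUERY,
            expires_at_secs: now_secs.saturating_add(ttl_secs),
            nonce,
        }
    }

    /// Revokes every grant carrying `nonce`.
    pub fn revoke(&mut self, nonce: u64) {
        self.revoked.insert(nonce);
    }

    /// Fails closed: any failed check returns a typed error and the caller
    /// must not touch storage. Revocation is checked first.
    pub fn validate(
        &self,
        grant: &Grant,
        now_secs: u64,
        audience: &[u8; 32],
        required: u32,
        resource_id: u64,
    ) -> Result<(), GrantError> {
        if self.revoked.contains(&grant.nonce) {
            return Err(GrantError::Revoked);
        }
        // Assumption: the grant is already expired at its expiry second.
        if now_secs >= grant.expires_at_secs {
            return Err(GrantError::Expired);
        }
        if &grant.audience != audience {
            return Err(GrantError::WrongAudience);
        }
        if grant.permissions & required != required {
            return Err(GrantError::MissingPermission);
        }
        if grant.resource_id != resource_id {
            return Err(GrantError::WrongResource);
        }
        Ok(())
    }
}
```

In this model a caller runs `validate` first and reads the dataset only on `Ok(())`, which mirrors the order used in the query path.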
In production the CapabilityAuthority is supplied by the scheduler/runtime to
the clean-room service. The local development authority in the snippet is only
for native cookbook tests and examples; unprivileged callers never receive the
authority handle.
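One way to picture this separation is a service that is generic over its authority and keeps it in a private field. The sketch below is illustrative only: the `Authority` trait, `DevAuthority`, and `QueryService` are hypothetical names, and the single-token check is far simpler than a real capability authority.

```rust
/// Hypothetical minimal authority interface for this sketch.
pub trait Authority {
    /// Returns true when `token` is valid for `audience` at `now_secs`.
    fn is_valid(&self, token: u64, audience: u8, now_secs: u64) -> bool;
}

/// Development-only authority: accepts one fixed token until a fixed expiry.
pub struct DevAuthority {
    pub token: u64,
    pub audience: u8,
    pub expires_at_secs: u64,
}

impl Authority for DevAuthority {
    fn is_valid(&self, token: u64, audience: u8, now_secs: u64) -> bool {
        token == self.token && audience == self.audience && now_secs < self.expires_at_secs
    }
}

/// The service owns the authority; its public API accepts only tokens, so
/// code outside this module never receives the authority handle.
pub struct QueryService<A: Authority> {
    authority: A,
}

impl<A: Authority> QueryService<A> {
    /// Whoever constructs the service (the runtime, or a test) supplies the authority.
    pub fn new(authority: A) -> Self {
        Self { authority }
    }

    /// Sums `data` only when the token validates; returns `None` otherwise.
    pub fn run(&self, token: u64, audience: u8, now_secs: u64, data: &[i64]) -> Option<i64> {
        if !self.authority.is_valid(token, audience, now_secs) {
            return None;
        }
        Some(data.iter().sum())
    }
}
```

Swapping `DevAuthority` for a runtime-supplied implementation changes only the constructor call, not the query path.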
Failure Modes
- Wrong audience: capability validation fails before reading data.
- Expired grant: capability validation fails before reading data.
- Revoked grant: revocation is checked before HMAC validation and before storage access.
- Missing dataset: the query returns a typed missing-dataset error.
- Duplicate query: the idempotency store suppresses duplicate effects.
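The duplicate-query case can be illustrated with a small in-memory model of reserve-then-complete. This is a sketch under stated assumptions, not the `ReplicatedIdempotencyStore` API: `ToyIdempotencyStore`, `Outcome`, `ReserveError`, and `run_once` are hypothetical, versions and fence epochs are left out, the raw payload bytes stand in for the operation hash, and rejecting a reused key with a different fingerprint is an assumed policy. The `u64` resource id is also an assumption.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    Pending,
    Completed,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ReserveError {
    /// The key was reused for a different operation (assumed policy).
    FingerprintMismatch,
}

/// Maps an idempotency key to its fingerprint and current outcome.
#[derive(Default)]
pub struct ToyIdempotencyStore {
    entries: HashMap<String, (Vec<u8>, Outcome)>,
}

/// Builds the payload in the same order as the recipe: resource id
/// (big-endian), query id, a zero separator, then the dataset URI.
pub fn canonical_payload(dataset_resource_id: u64, query_id: &str, uri: &str) -> Vec<u8> {
    let mut payload = dataset_resource_id.to_be_bytes().to_vec();
    payload.extend_from_slice(query_id.as_bytes());
    payload.push(0);
    payload.extend_from_slice(uri.as_bytes());
    payload
}

impl ToyIdempotencyStore {
    /// Reserves `key`, or returns the stored outcome when the key is known.
    pub fn reserve(&mut self, key: &str, fingerprint: Vec<u8>) -> Result<Outcome, ReserveError> {
        if let Some((existing, outcome)) = self.entries.get(key) {
            if *existing != fingerprint {
                return Err(ReserveError::FingerprintMismatch);
            }
            return Ok(outcome.clone());
        }
        self.entries
            .insert(key.to_string(), (fingerprint, Outcome::Pending));
        Ok(Outcome::Pending)
    }

    /// Marks a reserved key as completed; returns false for an unknown key.
    pub fn complete(&mut self, key: &str) -> bool {
        match self.entries.get_mut(key) {
            Some((_, outcome)) => {
                *outcome = Outcome::Completed;
                true
            }
            None => false,
        }
    }
}

/// Runs `effect` unless the key has already completed, in which case the
/// duplicate is suppressed and `Ok(None)` is returned.
pub fn run_once(
    store: &mut ToyIdempotencyStore,
    key: &str,
    fingerprint: Vec<u8>,
    effect: impl FnOnce() -> i64,
) -> Result<Option<i64>, ReserveError> {
    if store.reserve(key, fingerprint)? == Outcome::Completed {
        return Ok(None);
    }
    let value = effect();
    store.complete(key);
    Ok(Some(value))
}
```

A reservation that is still pending runs the effect again in this model, so a query that failed partway through can be retried under the same key.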
Tests
Run it with:
    cargo test -p cookbook-recipe-54-lease-scoped-data-clean-room
The tests cover real object-store reads, idempotent query recording, and fail-closed revocation.
See also:
- crates/grafos-store
- crates/grafos-std/src/capability.rs
- crates/grafos-replicated