Recipe 39 - Cross-Cloud Order Pipeline
Problem: A checkout pipeline needs accepted work to survive an availability-zone, region, or provider failure without rebuilding the usual stack of cloud-specific queues, dedupe tables, DNS failover, and stale-writer guards.
Solution: Model the pipeline as replicated resources:
- a replicated queue for order delivery;
- map-backed consumer cursors and acknowledgments;
- an idempotency store for webhook acceptance and order effects;
- a checkpoint pointer backed by replicated object storage;
- a lease for singleton maintenance work;
- failure-domain observations that prove whether cross-provider processing was explicitly allowed.
The compiled recipe lives in
cookbook/recipe-39-cross-cloud-order-pipeline. It uses public
grafos-replicated resource handles. There are no mocks, provider fallbacks,
or prose-only APIs.
use cookbook_recipe_39_cross_cloud_order_pipeline::{ gcp_region, CrossCloudOrderPipeline, Order,};
let mut pipeline = CrossCloudOrderPipeline::new()?;pipeline.receive_webhook(Order { id: "order-1".into(), cents: 4200,})?;
let processed = pipeline .process_one_from_domain(gcp_region())? .expect("order is available to the explicitly allowed GCP worker group");
assert_eq!(processed.order.id, "order-1");# Ok::<(), grafos_replicated::ReplicatedError>(())Core grafOS API Path
CrossCloudOrderPipeline::new() is only a small packaging layer. The recipe
constructs the replicated resources directly from public grafos-replicated
handles:
use fabricbios_core::lease::FenceEpoch;use grafos_replicated::{ ConsumerGroupName, DeliverySemantics, LogicalResourceName, PolicyHash, ReplicaHealth, ReplicaId, ReplicaLocator, ReplicaRole, ReplicaSetLocator, ReplicatedCheckpoint, ReplicatedFabricQueue, ReplicatedIdempotencyStore, ReplicatedLease, ResourceGeneration, SchemaId,};use cookbook_recipe_39_cross_cloud_order_pipeline::{ aws_zone, cross_cloud_replica_policy, cross_cloud_worker_placement, gcp_region, Order,};
let generation = ResourceGeneration(1);let replicas = cross_cloud_replica_policy();let locator = ReplicaSetLocator::new( generation, vec![ ReplicaLocator { replica_id: ReplicaId::new("orders-aws-a"), domain: aws_zone(), role: ReplicaRole::Voter, health: ReplicaHealth::Healthy, epoch: FenceEpoch(1), content_generation: generation.0, }, ReplicaLocator { replica_id: ReplicaId::new("orders-gcp-central"), domain: gcp_region(), role: ReplicaRole::Voter, health: ReplicaHealth::Healthy, epoch: FenceEpoch(1), content_generation: generation.0, }, ],);
let mut orders = ReplicatedFabricQueue::<Order>::new( LogicalResourceName::new("orders"), SchemaId::new("order.v1"), FenceEpoch(1), replicas.clone(), locator.clone(),)?;
orders.create_consumer_group( ConsumerGroupName::new("checkout-workers"), cross_cloud_worker_placement(), DeliverySemantics::EffectivelyOnceWithIdempotency, FenceEpoch(1),)?;
let effects = ReplicatedIdempotencyStore::new( LogicalResourceName::new("order-effects"), SchemaId::new("order-effect.v1"), FenceEpoch(1), replicas.clone(), locator.clone(),)?;
let checkpoints = ReplicatedCheckpoint::new( LogicalResourceName::new("order-checkpoints"), SchemaId::new("order-checkpoint.v1"), FenceEpoch(1), replicas.clone(), locator.clone(), PolicyHash([9; 32]),)?;
let leaders = ReplicatedLease::new( LogicalResourceName::new("order-leaders"), SchemaId::new("order-leader.v1"), FenceEpoch(1), replicas, locator,)?;# let _ = (orders, effects, checkpoints, leaders);# Ok::<(), grafos_replicated::ReplicatedError>(())The rest of the recipe calls reserve, queue append/poll/ack, checkpoint
save/restore, and lease acquire/renew/release on those handles. The wrapper
keeps the example readable; it does not hide a second API or provider fallback.
What The Recipe Proves
The crate exercises the user-facing resilience contract with real replicated resource primitives:
- duplicate webhooks replay the accepted offset instead of appending a second queue record;
- workers can process from AWS or GCP only because the consumer placement policy allowed both providers;
- an AWS-only worker placement fails closed when asked to poll from GCP;
- checkpoint restore reads the latest CAS pointer and object-backed bytes;
- leader election uses fence epochs and blocks a second holder until expiry;
- an AWS provider-loss drill updates resource observations while the explicitly allowed GCP worker path remains usable.
Runtime Shape
- The webhook handler reserves an idempotency key.
- The order is appended to the replicated queue.
- The accepted queue offset is recorded in the idempotency store before the handler returns success.
- Consumers poll through a placement policy. No provider is tried unless the policy authorized that failure domain.
- Processing completion records an effect and acknowledges the delivery.
- Checkpoints publish immutable object bytes first, then CAS-update the latest checkpoint pointer.
- Singleton maintenance uses a replicated lease, so stale holders fail by fence epoch.
- Failure drills commit signed health observations and assert resource events.
Replicated queues use quorum write acknowledgement by default. This recipe relies on that durable default; local-fast-path acknowledgement is an explicit policy choice for workloads that can tolerate weaker replay semantics.
Tests
Run it with:
cargo test -p cookbook-recipe-39-cross-cloud-order-pipelineThe tests cover:
- cross-cloud order processing from an explicitly allowed GCP domain;
- duplicate webhook replay;
- no hidden cross-provider fallback;
- checkpoint restore;
- lease-based leader election and fencing;
- provider-failure observation with continued allowed consumer placement;
- real SHA-256 content hashes.
See also:
crates/grafos-replicated