Recipe 7: Zero-Copy Microservices That Talk Through Memory, Not Sockets

Situation

Microservice architectures often pay an invisible tax:

serialization/deserialization
kernel/user crossings
TCP stack overhead
per-hop latency amplification across multi-hop critical paths

If services are co-located (same node, same rack, same fabric), you want a fast path that behaves like RPC but executes like shared-memory message passing.

In grafOS, “transport” can be a leased memory region. If services are on the same node, requests are literal memory writes/reads. If they are on different nodes, the fabric bridges the leased memory operations over a data plane.

What You Build

A tiny “service mesh” pattern:

A shared leased memory arena per service pair (or per client/server group).
grafos-rpc request/response protocol in that arena.
FabricDns for discovery of service endpoints.

The key result: the API looks like RPC, but the hot path is memory I/O.

Building Blocks

grafos_rpc::{RpcClient, RpcServer, RpcHandler}
grafos_std::mem::MemBuilder for the arena lease
grafos_net::FabricDns for service discovery

Design

Lease Layout

grafos-rpc uses fixed offsets:

request region at offset 0
response region at offset 32768

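The lease must cover both regions, including the largest payload each one carries. Because the response region starts at 32768 (32 KiB), a request frame has to fit in the first 32 KiB, and the lease has to extend far enough past 32768 to hold the largest response. The walkthrough in this recipe leases 64 KiB.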
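The sizing rule can be sketched as a small helper. The two offsets come from this recipe; the header size and the function name are assumptions made for illustration, since the actual grafos-rpc frame header is not described here.

```rust
/// Offsets stated in this recipe.
pub const REQUEST_OFFSET: usize = 0;
pub const RESPONSE_OFFSET: usize = 32 * 1024;

/// ASSUMPTION: a per-region header of 16 bytes. The real grafos-rpc
/// header size may differ; substitute the documented value.
pub const ASSUMED_HEADER_BYTES: usize = 16;

/// Smallest lease (in bytes) that holds both regions for a given maximum
/// payload. Returns None if a request frame of that size would run into
/// the response region, or if the arithmetic overflows.
pub fn min_lease_bytes(max_payload: usize) -> Option<usize> {
    let frame = ASSUMED_HEADER_BYTES.checked_add(max_payload)?;
    // The request frame must end at or before the start of the response region.
    if REQUEST_OFFSET.checked_add(frame)? > RESPONSE_OFFSET {
        return None;
    }
    // The response frame starts at RESPONSE_OFFSET and needs the same room.
    RESPONSE_OFFSET.checked_add(frame)
}
```

Under these assumptions, a 64 KiB lease accommodates payloads of up to 32 KiB minus the header in each direction.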

Ownership Model

Common patterns:

one lease per client/server pair
one lease per server, multiple clients (requires arbitration for concurrent requests)

The simplest recipe uses:

one client, one server, one lease

Safety

Because this is shared state, you must:

bound payload size
validate lengths
treat malformed frames as errors

grafos-rpc already enforces a maximum payload size and uses a status-byte state machine.
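To make the three rules concrete, here is a minimal sketch of validating a frame read from shared memory. The frame layout (one status byte, a little-endian method id, a little-endian length, then the payload), the status values, and the 4096-byte cap are all assumptions for illustration; they are not the actual grafos-rpc wire format.

```rust
/// ASSUMPTION: payload cap for this sketch; grafos-rpc defines its own limit.
pub const MAX_PAYLOAD: usize = 4096;
/// ASSUMPTION: header = status (1 byte) + method id (4) + payload length (4).
pub const HEADER_BYTES: usize = 9;

/// Hypothetical states of a region's status byte.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Idle,
    RequestReady,
    ResponseReady,
}

impl Status {
    pub fn from_byte(b: u8) -> Option<Status> {
        match b {
            0 => Some(Status::Idle),
            1 => Some(Status::RequestReady),
            2 => Some(Status::ResponseReady),
            _ => None,
        }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum FrameError {
    Truncated,
    BadStatus(u8),
    PayloadTooLarge(usize),
    LengthMismatch,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Frame<'a> {
    pub status: Status,
    pub method_id: u32,
    pub payload: &'a [u8],
}

/// Parse a frame from the start of a region, rejecting anything malformed.
pub fn parse_frame(region: &[u8]) -> Result<Frame<'_>, FrameError> {
    if region.len() < HEADER_BYTES {
        return Err(FrameError::Truncated);
    }
    let status = Status::from_byte(region[0]).ok_or(FrameError::BadStatus(region[0]))?;
    let method_id = u32::from_le_bytes([region[1], region[2], region[3], region[4]]);
    let len = u32::from_le_bytes([region[5], region[6], region[7], region[8]]) as usize;
    // Bound the payload before using the length for anything else.
    if len > MAX_PAYLOAD {
        return Err(FrameError::PayloadTooLarge(len));
    }
    // The declared length must fit inside the bytes actually available.
    let payload = region
        .get(HEADER_BYTES..HEADER_BYTES + len)
        .ok_or(FrameError::LengthMismatch)?;
    Ok(Frame { status, method_id, payload })
}
```

The length is checked against the cap before it is used to slice, so a corrupted or hostile length field cannot cause an out-of-bounds read or an oversized allocation.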

Walkthrough (Implementation Sketch)

Core grafOS API Path

The service pair shares one memory lease. The server polls that lease for requests; the client writes requests and waits for responses through the same lease-backed RPC protocol.

use grafos_rpc::{RpcClient, RpcHandler, RpcServer};
use grafos_std::error::{FabricError, Result};
use grafos_std::mem::MemBuilder;

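// 64 KiB covers the request region (offset 0) and the response region (offset 32768).
let lease = MemBuilder::new().min_bytes(64 * 1024).lease_secs(300).acquire()?;

// Echo handler: method 1 returns the payload unchanged; any other method is an error.
struct Echo;
impl RpcHandler for Echo {
    fn handle(&self, method_id: u32, payload: &[u8]) -> Result<Vec<u8>> {
        if method_id != 1 {
            return Err(FabricError::IoError(-1));
        }
        Ok(payload.to_vec())
    }
}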

let mut client = RpcClient::new(&lease);
let server = RpcServer::new(&lease);

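// Server side: poll the lease once for a pending request. A real server
// calls this in a loop from its own task or process.
let handled = server.poll_once(&Echo)?;

// Client side: write the request into the lease and wait for the response.
// This runs concurrently with the server loop.
let response: Vec<u8> = client.call(1, &b"hello".to_vec())?;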
# let _ = (handled, response);
# Ok::<(), grafos_std::FabricError>(())

The important point: the request/response bytes never traverse TCP; they traverse the leased memory abstraction.

Failure Modes

Disconnected: memory I/O fails; treat as transport failure.
LeaseExpired: the shared arena died; recreate it and re-handshake.
Contention: with multiple clients, you need a multiplexing protocol (outside the scope of this recipe).
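One way to act on the first two failure modes is a small wrapper that recreates the arena on expiry and surfaces everything else to the caller. This is a sketch only: `TransportError` is a local stand-in for the relevant FabricError cases, and `call_with_recovery` and its closure parameters are hypothetical names, not part of grafos-rpc.

```rust
/// Local stand-in for the error cases named in this recipe.
/// ASSUMPTION: the real FabricError variants may be shaped differently.
#[derive(Debug, PartialEq, Eq)]
pub enum TransportError {
    Disconnected,
    LeaseExpired,
    Other(i32),
}

/// Run `call`, and if it fails with LeaseExpired, run `recreate_lease`
/// (acquire a new arena and redo the handshake) before trying again.
/// At least one attempt is always made; at most `max_attempts` are made.
/// Disconnected and other errors are returned to the caller unchanged.
pub fn call_with_recovery<T>(
    max_attempts: u32,
    mut call: impl FnMut() -> Result<T, TransportError>,
    mut recreate_lease: impl FnMut() -> Result<(), TransportError>,
) -> Result<T, TransportError> {
    let mut attempt = 0;
    loop {
        attempt += 1;
        match call() {
            Ok(value) => return Ok(value),
            // The shared arena died: rebuild it, then retry the call.
            Err(TransportError::LeaseExpired) if attempt < max_attempts => {
                recreate_lease()?;
            }
            // Transport failure or retries exhausted: let the caller decide.
            Err(err) => return Err(err),
        }
    }
}
```

Retrying after LeaseExpired is only safe if the method is idempotent or the server can detect duplicates, because the client cannot tell whether the server processed the request before the arena disappeared.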

Observability

Track:

request count / response count
p50/p95/p99 call latency
payload size histogram
LeaseExpired events
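The signals above can be collected with a small in-process recorder. The sketch below uses only the standard library; `RpcMetrics`, its method names, and the bucket boundaries are illustrative choices, not a grafOS API.

```rust
use std::time::Duration;

/// Inclusive upper bounds (bytes) for the first three payload buckets;
/// a fourth bucket catches everything larger. The bounds are arbitrary.
const BUCKET_BOUNDS: [usize; 3] = [64, 1024, 16 * 1024];

#[derive(Debug, Default)]
pub struct RpcMetrics {
    pub requests: u64,
    pub responses: u64,
    pub lease_expired: u64,
    pub payload_buckets: [u64; 4],
    latencies_us: Vec<u64>,
}

impl RpcMetrics {
    /// Count a request and file its payload size into the histogram.
    pub fn record_request(&mut self, payload_len: usize) {
        self.requests += 1;
        let bucket = BUCKET_BOUNDS
            .iter()
            .position(|&bound| payload_len <= bound)
            .unwrap_or(BUCKET_BOUNDS.len());
        self.payload_buckets[bucket] += 1;
    }

    /// Count a response and remember its call latency in microseconds.
    pub fn record_response(&mut self, latency: Duration) {
        self.responses += 1;
        self.latencies_us.push(latency.as_micros() as u64);
    }

    pub fn record_lease_expired(&mut self) {
        self.lease_expired += 1;
    }

    /// Nearest-rank percentile (for example 50, 95, 99) over recorded latencies.
    /// Returns None when nothing is recorded or `percent` is outside 1..=100.
    pub fn latency_percentile_us(&self, percent: u64) -> Option<u64> {
        if self.latencies_us.is_empty() || percent == 0 || percent > 100 {
            return None;
        }
        let mut sorted = self.latencies_us.clone();
        sorted.sort_unstable();
        let n = sorted.len() as u64;
        // Integer form of ceil(percent * n / 100); always at least 1 here.
        let rank = (percent * n + 99) / 100;
        Some(sorted[(rank - 1) as usize])
    }
}
```

Storing every latency sample keeps the sketch short; a long-running service would more likely use a bounded histogram or a sliding window. A gap between `requests` and `responses` is itself a useful signal, since it indicates calls that never completed.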

Variations

Multi-client server: add a per-client slot or ring buffer of requests.
Streaming: use FabricQueue for one-way streaming and RPC for control.
Remote transparency: keep the same API; the fabric bridges cross-node memory I/O.
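For the multi-client variation, the per-client slot idea can be sketched as plain offset arithmetic: carve one large lease into fixed-size slots and give each client its own request/response pair. `SLOT_BYTES` and `slot_range` are hypothetical, and whether grafos-rpc can be pointed at a sub-range of a lease is an assumption this recipe does not confirm.

```rust
use std::ops::Range;

/// ASSUMPTION: each client gets a 64 KiB slot, matching the single-pair
/// lease size used in the walkthrough.
pub const SLOT_BYTES: usize = 64 * 1024;

/// Byte range of the slot owned by `client_index` inside a lease of
/// `lease_bytes`. Returns None if the slot would not fit in the lease.
pub fn slot_range(client_index: usize, lease_bytes: usize) -> Option<Range<usize>> {
    let start = client_index.checked_mul(SLOT_BYTES)?;
    let end = start.checked_add(SLOT_BYTES)?;
    if end > lease_bytes {
        return None;
    }
    Some(start..end)
}
```

Because slots never overlap, clients do not contend with each other; the server scans the slots in turn, which trades a little polling work for not needing a lock.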