Recipe 47: Content-Addressed Object Storage

What You Build

Build a content-addressable object store on grafos_store::MemObjectStore (hot, leased memory) wrapped by TieredObjectStore, which spills to a BlockObjectStore on durable storage. Each object's key is the SHA-256 of its bytes, and its CRC32 is verified on every read.

Source

cookbook/recipe-47-object-storage/ in the source tree.

The recipe builds the real manifest shape: take input bytes, compute SHA-256 + CRC32, return { sha256, crc32, size_bytes }. Storage layers use the SHA-256 as the content address and the CRC32 to verify reads.

Core grafOS API Path

The public object path is FabricUri plus the ObjectStore trait. A hot memory-backed store uses MemObjectStore; a hot+cold store uses TieredObjectStore.

use grafos_store::{FabricUri, MemObjectStore, ObjectStore, TieredObjectStore};

// Hot store backed by leased memory.
let mut hot = MemObjectStore::new(64)?;

// The SHA-256 from the manifest becomes the object key.
let manifest = compute(b"payload");
let uri = FabricUri::new("default", "reports", &manifest.sha256)?;
hot.put(&uri, b"payload", None)?;

// Reading back returns the bytes plus metadata, including the stored CRC32.
let object = hot.get(&uri)?.expect("object exists");
assert_eq!(object.data, b"payload");
assert_eq!(object.info.crc32, manifest.crc32);

// The hot+cold store takes the same put call through the ObjectStore trait.
let mut tiered = TieredObjectStore::new(32, 256, 16)?;
tiered.put(&uri, b"payload", None)?;
tiered.checkpoint()?;
# Ok::<(), grafos_std::FabricError>(())

The recipe helper is just the content-addressed write packaged as a function:

pub fn compute(input: &[u8]) -> ObjectManifest {
    ObjectManifest {
        sha256: sha256_hex(input),
        crc32: crc32(input),
        size_bytes: input.len(),
    }
}

pub fn put_content_addressed<S: ObjectStore>(
    store: &mut S,
    pool: &str,
    bucket: &str,
    data: &[u8],
) -> FabricResult<ObjectManifest> {
    let manifest = compute(data);
    let uri = FabricUri::new(pool, bucket, &manifest.sha256)?;
    store.put(&uri, data, None)?;
    Ok(manifest)
}

What’s Interesting

  1. Two-layer integrity. SHA-256 is the address; CRC32 is the on-read sanity check. The address only identifies which object you asked for, not whether the bytes the storage returned are intact, so the CRC32 check catches in-flight bit flips that the SHA-256 lookup would not catch by itself.
  2. The CRC32 uses the standard polynomial. 0xEDB88320 is the reflected form of the polynomial that zip and gzip use; crc32(b"123456789") is 0xCBF43926, and a test pins this value.
  3. Real hash, real checksum. sha2::Sha256 computes the content address and the CRC32 implementation is pinned by the standard 123456789 test vector.
  4. Real store path. put_content_addressed accepts any grafos_store::ObjectStore, builds a FabricUri, and writes the object through the public store trait.
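To make the checksum in points 1 and 2 concrete, here is a minimal bitwise sketch of CRC-32 with the reflected polynomial 0xEDB88320. It is not the recipe's own implementation, which is not shown here; the function simply reuses the name crc32 for readability and depends on nothing outside the standard library.

```rust
/// Reflected form of the CRC-32 polynomial named in the text.
const POLY: u32 = 0xEDB8_8320;

/// Bitwise CRC-32 (the zip/gzip variant): initial value 0xFFFF_FFFF,
/// bits processed least-significant first, final bitwise NOT.
/// Illustrative only; a table-driven version would be faster.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero,
            // so the polynomial is folded in only when a 1 bit shifts out.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (POLY & mask);
        }
    }
    !crc
}
```

With this sketch, crc32(b"123456789") evaluates to 0xCBF43926, the standard check value quoted above, and the empty input yields 0.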

Failure Behavior

  • The content address is deterministic: repeated writes of the same bytes produce the same SHA-256 key and manifest.
  • FabricUri::new rejects empty pool, bucket, or key values.
  • ObjectStore::put returns the underlying fabric error if the backing memory or block lease cannot accept the write.
  • Reads through ObjectStore::get verify CRC32 against stored metadata before returning data.
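The behaviors listed above can be sketched with a toy in-memory store. Everything in this block (ToyStore, ToyError, the corrupt hook) is hypothetical and exists only for illustration; it is not the grafos_store API, and the string key merely stands in for the SHA-256 hex digest. The sketch assumes the checksum is recorded at write time and recomputed on read.

```rust
use std::collections::HashMap;

/// Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical error type; the real store returns fabric errors instead.
#[derive(Debug, PartialEq)]
pub enum ToyError {
    EmptyKey,
    ChecksumMismatch { expected: u32, actual: u32 },
}

/// Hypothetical store: key -> (bytes, CRC32 recorded at write time).
#[derive(Default)]
pub struct ToyStore {
    objects: HashMap<String, (Vec<u8>, u32)>,
}

impl ToyStore {
    /// Rejects empty keys; rewriting the same bytes under the same key
    /// leaves the store in the same state.
    pub fn put(&mut self, key: &str, data: &[u8]) -> Result<(), ToyError> {
        if key.is_empty() {
            return Err(ToyError::EmptyKey);
        }
        self.objects
            .insert(key.to_owned(), (data.to_vec(), crc32(data)));
        Ok(())
    }

    /// Recomputes the CRC32 and compares it with the stored value
    /// before handing any bytes back to the caller.
    pub fn get(&self, key: &str) -> Result<Option<Vec<u8>>, ToyError> {
        match self.objects.get(key) {
            None => Ok(None),
            Some((data, expected)) => {
                let actual = crc32(data);
                if actual != *expected {
                    return Err(ToyError::ChecksumMismatch {
                        expected: *expected,
                        actual,
                    });
                }
                Ok(Some(data.clone()))
            }
        }
    }

    /// Illustration-only hook: flips one bit of a stored object.
    pub fn corrupt(&mut self, key: &str) {
        if let Some((data, _)) = self.objects.get_mut(key) {
            if let Some(first) = data.first_mut() {
                *first ^= 0x01;
            }
        }
    }
}
```

In this sketch, a get after corrupt returns ChecksumMismatch rather than the damaged bytes, which is the property the last bullet describes for the real read path.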

Run And Verify

cargo test -p cookbook-recipe-47-object-storage

Expected: all tests pass. They pin empty input, the standard CRC32 vector, the SHA-256 of "abc", manifest serialization, and the real ObjectStore helper path.

Adapt It

Use bucket names that match your workload boundary, and choose MemObjectStore, BlockObjectStore, or TieredObjectStore according to durability and latency needs. Keep the SHA-256 key in receipts so consumers can verify they loaded the exact object that was written.