grafos cloud
Connect AWS / GCP / Azure provider cells in customer-owned cloud mode.
Connect / disconnect cloud providers
Usage: grafos cloud [OPTIONS] <COMMAND>
Commands:
connect Connect a cloud provider. Stores config under `.grafos/cloud/<provider>.json` in the project root
disconnect Remove a provider's connector config
status Show configured connectors per provider
provision Provision a real fabric cell on a cloud provider
cells List provisioned cells from `.grafos/cloud/<provider>-cells.json`
teardown Terminate provisioned cells
register-cell Register a provider cell with the public scheduler
cell-agent Run a provider cell agent that polls outbound scheduler work
bootstrap-cell Provider cell identity bootstrap. Generates a local keypair on the cell host, builds a CSR, exchanges it (plus a one-use bootstrap token from `provider init` / `cells/bootstrap/tokens`) for an mTLS cert at `/api/v1/cells/bootstrap/exchange`, and writes `cell-key.pem` + `cell-cert.pem` + `ca-bundle.pem` + `identity.json` into the identity directory. Subsequent `cloud cell-agent` invocations point `--cell-agent-cert` / `--cell-agent-key` at the identity directory's files
rotate-cell-identity Rotate the cell's mTLS identity. Reads the current cert + key from the identity dir, builds a fresh CSR, presents it (under the current cert as mTLS auth) to `/api/v1/cells/identity/rotate`, and atomically replaces `cell-cert.pem` + `identity.json`. Refuses if the current identity is missing, expired, or revoked. Designed to run from a systemd timer (or an idle moment in the agent's main loop) BEFORE `now >= rotates_after`
doctor Bundle the connector + STS + cell + tenant-deploy probes. Verifies that an AWS-backed fabric cell can accept tenant-scoped deploy requests right now
help Print this message or the help of the given subcommand(s)
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud connect
Connect a cloud provider. Stores config under `.grafos/cloud/<provider>.json` in the project root
Usage: grafos cloud connect [OPTIONS] <COMMAND>
Commands:
aws Connect AWS. Requires `--mode tenura-managed` OR `--mode customer-owned --role-arn... --external-id...`
gcp Connect GCP. Verifies operator gcloud ADC + project access, then writes a connector at `.grafos/cloud/gcp.json`. v1 uses the operator's gcloud auth (no service-account key file is stored). GCP cells follow the registered-cell pattern: each provisioned cell runs `cell-agent` and registers outbound with the public scheduler
azure Placeholder until Azure provider work lands
help Print this message or the help of the given subcommand(s)
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud connect aws
Connect AWS. Requires `--mode tenura-managed` OR `--mode customer-owned --role-arn... --external-id...`
Usage: grafos cloud connect aws [OPTIONS] --mode <MODE>
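The mode rule stated above (`customer-owned` needs both a role ARN and an external id, `tenura-managed` needs neither) can be checked in a wrapper script before the CLI is invoked. The following is a minimal sketch of that rule, not the CLI's own validation; the `ConnectAwsArgs` class and both helper functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConnectAwsArgs:
    """Hypothetical container mirroring a few `grafos cloud connect aws` flags."""
    mode: str
    role_arn: Optional[str] = None
    external_id: Optional[str] = None
    regions: str = "us-east-1"  # documented default for --regions


def validate_connect_aws(args: ConnectAwsArgs) -> List[str]:
    """Return a list of problems; an empty list means the flags are consistent."""
    problems = []
    if args.mode not in ("tenura-managed", "customer-owned"):
        problems.append(f"unknown --mode value: {args.mode}")
    if args.mode == "customer-owned":
        if not args.role_arn:
            problems.append("--role-arn is required for customer-owned mode")
        if not args.external_id:
            problems.append("--external-id is required for customer-owned mode")
    return problems


def parse_regions(value: str) -> List[str]:
    """Split the comma-separated --regions value into a clean list."""
    return [r.strip() for r in value.split(",") if r.strip()]
```

A wrapper could call `validate_connect_aws` first and only shell out to `grafos cloud connect aws` when the returned list is empty.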
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var
[env: GRAFOS_FABRIC=]
--mode <MODE> Which cloud ownership mode to configure
Possible values:
- tenura-managed: User pays Tenura (included credits or paid balance). Tenura controls the AWS account, launches cells, meters usage, and tears down. Minimizes user setup
- customer-owned: User connects their own AWS account via an external-id IAM role + short-lived STS session. Tenura never stores long-lived AWS secrets; the per-run session expires automatically
--json Output in JSON format for scripting
--role-arn <ROLE_ARN> IAM role ARN the CLI will assume via STS (customer-owned mode). Ignored for `--mode tenura-managed`
--external-id <EXTERNAL_ID> External id the IAM role's trust policy expects (customer-owned mode)
--wide Show additional columns in table output
--no-color Disable color output
--regions <REGIONS> Comma-separated allowed regions. Defaults to `us-east-1`
[default: us-east-1]
--max-cost-usd <MAX_COST_USD> Per-run cost cap in USD (customer-owned mode). Omitted means the connector accepts any cost the pre-flight estimate returns
--pool <POOL> Pool name (default: "default")
[default: default]
--skip-verify Skip the live STS AssumeRole + GetCallerIdentity check. The resulting connector is recorded with `verified: false` and is NOT considered ready by `grafos cloud status` or the dashboard. Use only for offline/CI scenarios where a real AWS round-trip is impossible. Default behaviour verifies
-h, --help Print help (see a summary with '-h')
-V, --version Print version

grafos cloud connect gcp
Connect GCP. Verifies operator gcloud ADC + project access, then writes a connector at `.grafos/cloud/gcp.json`. v1 uses the operator's gcloud auth (no service-account key file is stored). GCP cells follow the registered-cell pattern: each provisioned cell runs `cell-agent` and registers outbound with the public scheduler
Usage: grafos cloud connect gcp [OPTIONS]
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--project <PROJECT> GCP project id. Defaults to `gcloud config get-value project`
--json Output in JSON format for scripting
--region <REGION> Default region for cells provisioned through this connector [default: us-east1]
--max-cost-usd <MAX_COST_USD> Optional cost cap in USD per provisioned cell
--wide Show additional columns in table output
--no-color Disable color output
--skip-verify Skip the live `gcloud auth list` + `gcloud projects describe` round-trip. The resulting connector is recorded with `verified=false` and is NOT considered ready
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud connect azure
Placeholder until Azure provider work lands
Usage: grafos cloud connect azure [OPTIONS]
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud disconnect
Remove a provider's connector config
Usage: grafos cloud disconnect [OPTIONS] <PROVIDER>
Arguments: <PROVIDER> [possible values: aws, gcp, azure]
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud status
Show configured connectors per provider
Usage: grafos cloud status [OPTIONS]
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud provision
Provision a real fabric cell on a cloud provider
Usage: grafos cloud provision [OPTIONS] <COMMAND>
Commands:
aws Launch a t4g.medium (default) in us-east-1, start fabricbiosd + grafos-scheduler under mtls+token, record the cell in `.grafos/cloud/aws-cells.json`. AWS uses the standalone-cell pattern (each cell is a self-contained mini-fabric)
gcp Launch a GCP cell that runs fabricbiosd + cell-scheduler + cell-agent. The cell-agent registers outbound with the public scheduler at scheduler.grafos.tenura.systems; deploys arrive via the long-poll work channel. GCP uses the registered-cell pattern
azure Launch an Azure VM. Minimum-viable: creates a Standard_B1s in the operator's active subscription, records the cell in `.grafos/cloud/azure-cells.json`. No cell-bootstrap yet (parallel to `cloud provision gcp --no-bootstrap`); cert install + scheduler start + colocate land in a follow-up slice that bakes a custom Compute Gallery image. Provides the foundation for `scripts/conformance-azure.sh` to lift Azure from 0/10 (placeholder) to 1/10 PASS today
help Print this message or the help of the given subcommand(s)
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud provision aws
Launch a t4g.medium (default) in us-east-1, start fabricbiosd + grafos-scheduler under mtls+token, record the cell in `.grafos/cloud/aws-cells.json`. AWS uses the standalone-cell pattern (each cell is a self-contained mini-fabric)
Usage: grafos cloud provision aws [OPTIONS]
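The `auto` value of `--mode` for this command resolves the credential source from the connector file. The sketch below shows one way that resolution could look. It is an illustration only: the JSON key `"mode"` inside `.grafos/cloud/aws.json` is an assumption (the connector schema is not shown in this reference), and `resolve_mode` is a hypothetical helper name.

```python
import json
from pathlib import Path


def resolve_mode(requested: str, project_root: Path) -> str:
    """Resolve `--mode` the way the `auto` description reads.

    An explicit mode is returned unchanged. `auto` picks `customer-owned`
    only when the connector file exists and says so; otherwise it falls
    back to `tenura-managed`.
    """
    if requested != "auto":
        return requested
    connector = project_root / ".grafos" / "cloud" / "aws.json"
    if connector.is_file():
        data = json.loads(connector.read_text())
        # ASSUMPTION: the connector records its ownership mode under "mode".
        if data.get("mode") == "customer-owned":
            return "customer-owned"
    return "tenura-managed"
```

With no connector file present, `resolve_mode("auto", Path("."))` falls through to `tenura-managed`, which matches the documented default behaviour.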
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var
[env: GRAFOS_FABRIC=]
--region <REGION> AWS region
[default: us-east-1]
--json Output in JSON format for scripting
--mode <MODE> Credential source for the EC2 launch. `auto` (default) reads `.grafos/cloud/aws.json` and picks `customer-owned` if the connector is in that mode, else `tenura-managed`. Pass `--mode customer-owned` explicitly to require that path
Possible values:
- tenura-managed: Use ambient AWS credentials (env / profile / IMDS). Tenura owns this account and brokers capacity
- customer-owned: Load `.grafos/cloud/aws.json`, AssumeRole using the connector's `role_arn` + `external_id`, run EC2 ops under the resulting short-lived STS session
- auto: Resolve from `.grafos/cloud/aws.json` if present; otherwise default to `tenura-managed`. This is the default so existing workflows ("connector says customer-owned, just run") keep working without an explicit `--mode`
[default: auto]
--instance-type <INSTANCE_TYPE> EC2 instance type
[default: t4g.medium]
--wide Show additional columns in table output
--ami <AMI> AMI id to launch. Overrides `.grafos/cloud/aws-ami.txt`. Supply a baked AMI for the fast path (<2 min cold launch); omit to bootstrap-build on the instance (~11 min)
--no-color Disable color output
--key-name <KEY_NAME> SSH key pair registered in AWS
[default: fabricbios-nvmeof-test]
--pool <POOL> Pool name (default: "default")
[default: default]
--key-pem <KEY_PEM> Path to the matching SSH private key on disk
--tenant <TENANT> Tenant name to register on the launched scheduler. Defaults to `mvp-tenant` for parity with the shell MVP
[default: mvp-tenant]
--note <NOTE> Free-form note recorded with the cell for later teardown
-h, --help Print help (see a summary with '-h')
-V, --version Print version

grafos cloud provision gcp
Launch a GCP cell that runs fabricbiosd + cell-scheduler + cell-agent. The cell-agent registers outbound with the public scheduler at scheduler.grafos.tenura.systems; deploys arrive via the long-poll work channel. GCP uses the registered-cell pattern
Usage: grafos cloud provision gcp [OPTIONS]
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--project <PROJECT> Override the gcloud default project. When omitted, uses the project recorded in `.grafos/cloud/gcp.json`
--json Output in JSON format for scripting
--zone <ZONE> Compute Engine zone. Defaults to us-east1-d (matches the orchestrator's zone). If the chosen zone is over-subscribed for the requested machine type, gcloud returns a clear error [default: us-east1-d]
--machine-type <MACHINE_TYPE> Machine type. e2-micro is the cheapest viable option but the free-tier slot is reserved for the always-on orchestrator; use --preemptible to keep cells cheap (~$0.0005/min) [default: e2-micro]
--wide Show additional columns in table output
--no-color Disable color output
--preemptible Use a preemptible (spot) VM. Cheaper, but GCP can reclaim it at any time. 201.g supersede-on-restart handles preemption cleanly: the cell's run moves to a failed terminal state and the orchestrator routes future work elsewhere
--cell-id <CELL_ID> Cell id reported via the cell-agent registration. When omitted, derived deterministically from the GCP instance id after launch
--pool <POOL> Pool name (default: "default") [default: default]
--no-bootstrap Skip the post-launch bootstrap step (binaries + certs + systemd). The VM is left bare; you can run `deploy/gcp/cell-bootstrap.sh --instance <name>...` later against the persisted record
--scheduler-url <SCHEDULER_URL> Public scheduler URL the cell-agent registers with. Defaults to the production tenura scheduler. Override for staging or a self-hosted scheduler [default: https://scheduler.grafos.tenura.systems:9200]
--total-mem <TOTAL_MEM> Total fabric memory the cell advertises to the public scheduler (bytes). The default matches an e2-micro's usable RAM after fabricbiosd + scheduler overhead [default: 536870912]
--total-cpu <TOTAL_CPU> Total fabric CPUs the cell advertises. e2-micro is 0.25 vCPU burstable; reporting `1` is the smallest legal positive value [default: 1]
--image <IMAGE> Custom image to launch from. Defaults to the family `grafos-cell` in the connector's project, built by `deploy/gcp/image/build-image.sh`. Pin a specific version like `grafos-cell-v0-1-0-git55459402` to override
--image-family <IMAGE_FAMILY> Image family alias resolved when `--image` is unset [default: grafos-cell]
--image-project <IMAGE_PROJECT> Project that hosts `--image-family`. Defaults to the connector's project (custom images live alongside the cells they bake into). Override only for cross-project image sharing
--ca-dir <CA_DIR> Directory containing `ca.pem` + `ca-key.pem` for issuing per-cell certs. Default search order: env `GRAFOS_CA_DIR`, then `<src>/deploy/compose/tls/ca/` when running from a source tree. The shipped binary requires an explicit `--ca-dir` or `GRAFOS_CA_DIR` since it has no built-in default location
--colocate-orchestrator Start a second `grafos-scheduler` in orchestrator mode on the same VM (bound to localhost:9201, `--auth-mode none`) and override the cell-agent's scheduler URL to `http://localhost:9201` so it registers there instead of against the public Tenura scheduler. The resulting fabric is fully self-contained (the same model `provision_aws` already uses on EC2), which is what `scripts/conformance-gcp.sh` needs to exercise the cell-agent outbound bootstrap path without depending on a reachable external orchestrator. Default false (the production path still hits `--scheduler-url`)
--region <REGION> Provider-side cost metadata field, together with `--zone-label`, `--instance-type`, and `--account`. Threaded through `cell-bootstrap.sh` into `/etc/grafos/cell.env` as `GRAFOS_RONALD_REGION` / `GRAFOS_RONALD_ZONE` / `GRAFOS_RONALD_INSTANCE_TYPE` / `GRAFOS_RONALD_ACCOUNT`. The `grafos cloud cell-agent` systemd unit picks them up automatically via the `#[arg(env =...)]` path on `CellAgentArgs`. The cell-agent then sends them on its register POST so the orchestrator populates `CloudRunRecord.cost` for runs that land here
--zone-label <ZONE_LABEL>
--instance-type <INSTANCE_TYPE>
--account <ACCOUNT>
-h, --help Print help
-V, --version Print version

grafos cloud provision azure
Launch an Azure VM. Minimum-viable — creates a Standard_B1s in the operator's active subscription, records the cell in `.grafos/cloud/azure-cells.json`. No cell-bootstrap yet (parallel to `cloud provision gcp --no-bootstrap`); cert install + scheduler start + colocate land in a follow-up slice that bakes a custom Compute Gallery image. Provides the foundation for `scripts/conformance-azure.sh` to lift Azure from 0/10 (placeholder) to 1/10 PASS today
Usage: grafos cloud provision azure [OPTIONS]
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--subscription <SUBSCRIPTION> Azure subscription id. When omitted, uses `az account show --query id` (the operator's currently-active subscription)
--json Output in JSON format for scripting
--resource-group <RESOURCE_GROUP> Resource group. Created if it doesn't exist (idempotent `az group create`). Default `grafos-cells` keeps every conformance run in one bucket so cleanup is a single `az group delete` if the cell record gets lost [default: grafos-cells]
--location <LOCATION> Azure region. `eastus` is the cheapest in our band and has B-series spot capacity most of the time [default: eastus]
--wide Show additional columns in table output
--no-color Disable color output
--vm-size <VM_SIZE> VM size. `Standard_B1s` is the cheapest viable size (~$0.0104/hr, 1 vCPU, 1 GiB). Override for larger machines. For conformance-only runs, B1s is enough [default: Standard_B1s]
--image <IMAGE> VM image URN. Defaults to a published Ubuntu 22.04 LTS image. Override with a Compute Gallery image once the custom-image bake slice ships [default: Canonical:0001-com-ubuntu-server-jammy:22_04-lts:latest]
--pool <POOL> Pool name (default: "default") [default: default]
--cell-id <CELL_ID> Cell id reported when the cell-agent registration slice ships. Today (no-bootstrap) the field is recorded for later correlation but no cell-agent runs
--ssh-key <SSH_KEY> SSH key path on disk. Created via `ssh-keygen` if not supplied. Default looks at `~/.ssh/id_rsa.pub`
-h, --help Print help
-V, --version Print version

grafos cloud cells
List provisioned cells from `.grafos/cloud/<provider>-cells.json`
Usage: grafos cloud cells [OPTIONS] <COMMAND>
Commands:
aws List AWS cells from `.grafos/cloud/aws-cells.json`
gcp List GCP cells from `.grafos/cloud/gcp-cells.json`
azure List Azure cells from `.grafos/cloud/azure-cells.json`
help Print this message or the help of the given subcommand(s)
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud cells aws
List AWS cells from `.grafos/cloud/aws-cells.json`
Usage: grafos cloud cells aws [OPTIONS]
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud cells gcp
List GCP cells from `.grafos/cloud/gcp-cells.json`
Usage: grafos cloud cells gcp [OPTIONS]
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud cells azure
List Azure cells from `.grafos/cloud/azure-cells.json`
Usage: grafos cloud cells azure [OPTIONS]
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud teardown
Terminate provisioned cells
Usage: grafos cloud teardown [OPTIONS] <COMMAND>
Commands:
aws Terminate one AWS cell (--cell-id) or all of them (--all)
gcp Terminate one GCP cell (--cell-id) or all of them (--all)
azure Terminate one Azure cell (--cell-id) or all of them (--all). Best-effort cleanup of associated NIC + public IP + NSG so the cell doesn't leak tail-resources on the Azure bill
help Print this message or the help of the given subcommand(s)
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud teardown aws
Terminate one AWS cell (--cell-id) or all of them (--all)
Usage: grafos cloud teardown aws [OPTIONS]
Options:
--cell-id <CELL_ID> Terminate a specific cell from aws-cells.json
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--all Terminate all recorded AWS cells
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud teardown gcp
Terminate one GCP cell (--cell-id) or all of them (--all)
Usage: grafos cloud teardown gcp [OPTIONS]
Options:
--cell-id <CELL_ID> Terminate a single cell by id
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--all Terminate every cell in `.grafos/cloud/gcp-cells.json`
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud teardown azure
Terminate one Azure cell (--cell-id) or all of them (--all). Best-effort cleanup of associated NIC + public IP + NSG so the cell doesn't leak tail-resources on the Azure bill
Usage: grafos cloud teardown azure [OPTIONS]
Options:
--cell-id <CELL_ID> Terminate a single cell by id
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--all Terminate every cell in `.grafos/cloud/azure-cells.json`
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default") [default: default]
-h, --help Print help
-V, --version Print version

grafos cloud register-cell
Register a provider cell with the public scheduler
Usage: grafos cloud register-cell [OPTIONS] --cell-agent-cert <CELL_AGENT_CERT> --cell-agent-key <CELL_AGENT_KEY> --cell-url <CELL_URL> --cell-id <CELL_ID>
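The `--cell-agent-cert` option requires the certificate's URI SAN to follow the pattern `urn:fabricbios:cell:<provider>:<cell_id>` and to agree with `--provider` and `--cell-id`. A minimal sketch of that string check follows; it only compares strings and does not parse a real certificate, and both function names are hypothetical.

```python
def cell_uri_san(provider: str, cell_id: int) -> str:
    """Build the URI SAN the documentation expects for a given cell."""
    return f"urn:fabricbios:cell:{provider}:{cell_id}"


def san_matches(uri_san: str, provider: str, cell_id: int) -> bool:
    """True when a SAN read from a cert agrees with --provider / --cell-id."""
    return uri_san == cell_uri_san(provider, cell_id)
```

For example, a cert carrying `urn:fabricbios:cell:gcp:7` would pass `san_matches(..., "gcp", 7)` but fail for provider `aws` or any other cell id.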
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--scheduler <SCHEDULER> Public scheduler/orchestrator base URL. When omitted, defaults to GRAFOS_PUBLIC_SCHEDULER_URL or scheduler.grafos.tenura.systems [env: GRAFOS_SCHEDULER=]
--cell-agent-cert <CELL_AGENT_CERT> Cell-agent client cert (PEM). The cert's URI SAN must be `urn:fabricbios:cell:<provider>:<cell_id>` and match `--provider` / `--cell-id`. Issued via `grafos admin issue-cert --role cell --provider <p> --cell-id <id>` [env: GRAFOS_CELL_AGENT_CERT=]
--json Output in JSON format for scripting
--cell-agent-key <CELL_AGENT_KEY> Cell-agent client key (PEM) corresponding to `--cell-agent-cert` [env: GRAFOS_CELL_AGENT_KEY=]
--wide Show additional columns in table output
--no-color Disable color output
--scheduler-ca <SCHEDULER_CA> Scheduler CA bundle for verifying the orchestrator's TLS cert. Required when --scheduler is https:// (the normal case) [env: GRAFOS_SCHEDULER_CA=]
--pool <POOL> Pool name (default: "default") [default: default]
--provider <PROVIDER> Provider identity for this cell, e.g. docker or aws [default: lab]
--control-mode <CONTROL_MODE> How the scheduler currently reaches this cell. V1 is http-forward; a later Phase 201 subtask replaces this with a persistent outbound work channel [default: http-forward]
--cell-url <CELL_URL> Reachable cell scheduler URL for v1 HTTP-forward dispatch
--cell-id <CELL_ID> Numeric cell id reported to the scheduler
--role <ROLE> Scheduler role for the cell [default: active]
--leader-epoch <LEADER_EPOCH> Leader epoch reported for this cell [default: 0]
--nodes <NODES> Number of nodes represented by this cell [default: 1]
--total-mem <TOTAL_MEM> Total memory capacity in bytes [default: 0]
--available-mem <AVAILABLE_MEM> Available memory capacity in bytes [default: 0]
--total-cpu <TOTAL_CPU> Total CPU capacity [default: 0]
--available-cpu <AVAILABLE_CPU> Available CPU capacity [default: 0]
--unhealthy Register the cell as unhealthy
-h, --help Print help
-V, --version Print version

grafos cloud cell-agent
Run a provider cell agent that polls outbound scheduler work
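The `--state-dir` option of this command describes an in-flight record: written when work is accepted, cleared on completion (200 or 409), and finalized first on the next startup. The sketch below models that lifecycle under stated assumptions. The JSON key names are guesses based on the documented tuple `(provider, cell_id, assignment_id, generation)`, the write-then-rename step is an illustrative choice, and all three function names are hypothetical.

```python
import json
import os
from pathlib import Path
from typing import Optional

INFLIGHT_NAME = "cell-agent-inflight.json"  # file name from the --state-dir help


def record_inflight(state_dir: Path, provider: str, cell_id: int,
                    assignment_id: str, generation: int) -> None:
    """Persist the accepted assignment before any work starts."""
    state_dir.mkdir(parents=True, exist_ok=True)
    path = state_dir / INFLIGHT_NAME
    tmp = path.with_suffix(".tmp")
    # ASSUMPTION: key names mirror the documented tuple; the real schema may differ.
    tmp.write_text(json.dumps({
        "provider": provider,
        "cell_id": cell_id,
        "assignment_id": assignment_id,
        "generation": generation,
    }))
    os.replace(tmp, path)  # rename so a crash never leaves a half-written file


def clear_inflight(state_dir: Path, http_status: int) -> bool:
    """Remove the record once completion returned 200 or 409."""
    if http_status in (200, 409):
        (state_dir / INFLIGHT_NAME).unlink(missing_ok=True)
        return True
    return False


def pending_inflight(state_dir: Path) -> Optional[dict]:
    """On startup, return the assignment to finalize first, if any."""
    path = state_dir / INFLIGHT_NAME
    return json.loads(path.read_text()) if path.is_file() else None
```

An agent loop built on this would call `pending_inflight` before its first poll, which is the ordering the option text describes for crash recovery.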
Usage: grafos cloud cell-agent [OPTIONS] --scheduler <SCHEDULER> --cell-agent-cert <CELL_AGENT_CERT> --cell-agent-key <CELL_AGENT_KEY> --cell-url <CELL_URL>
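Two of this command's options interact: `--wait-seconds` is capped by the scheduler at 60 s, and `--timeout-secs` must exceed `--wait-seconds` plus some slack. The following sketch expresses both constraints as a pre-flight check. The 5-second slack is an assumed value (the help text only says "a few seconds"), and the function names are hypothetical.

```python
SERVER_WAIT_CAP_SECS = 60  # documented server-side cap on the long-poll window


def effective_wait(wait_seconds: int) -> int:
    """The wait window the scheduler would actually honour."""
    return min(wait_seconds, SERVER_WAIT_CAP_SECS)


def timeout_is_safe(timeout_secs: int, wait_seconds: int, slack_secs: int = 5) -> bool:
    """True when the HTTP read timeout outlasts the long-poll window plus slack.

    ASSUMPTION: slack_secs=5 stands in for "a few seconds of slack".
    """
    return timeout_secs > wait_seconds + slack_secs
```

With the documented defaults (`--wait-seconds 30`, `--timeout-secs 300`) the check passes comfortably; setting both to 30 would fail it.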
Options:
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--scheduler <SCHEDULER> Public or internal scheduler URL, e.g. https://scheduler:9200 [env: GRAFOS_SCHEDULER=]
--cell-agent-cert <CELL_AGENT_CERT> Cell-agent client certificate (PEM). The cert's URI SAN must be `urn:fabricbios:cell:<provider>:<cell_id>` and match `--provider`/`--cell-id`. Issued by `grafos admin issue-cert --role cell` [env: GRAFOS_CELL_AGENT_CERT=]
--json Output in JSON format for scripting
--cell-agent-key <CELL_AGENT_KEY> Cell-agent client private key (PEM) corresponding to `--cell-agent-cert` [env: GRAFOS_CELL_AGENT_KEY=]
--wide Show additional columns in table output
--no-color Disable color output
--provider <PROVIDER> Provider identity for this agent, e.g. docker [env: GRAFOS_RONALD_PROVIDER=] [default: lab]
--cell-id <CELL_ID> Numeric cell id reported to the scheduler [env: GRAFOS_RONALD_CELL_ID=] [default: 7]
--pool <POOL> Pool name (default: "default") [default: default]
--cell-url <CELL_URL> Local cell scheduler URL. The global scheduler never calls this URL [env: GRAFOS_CELL_SCHEDULER_URL=]
--scheduler-ca <SCHEDULER_CA> CA bundle used to verify scheduler TLS [env: GRAFOS_SCHEDULER_CA=]
--cell-ca <CELL_CA> CA bundle used to verify local cell scheduler TLS [env: GRAFOS_CELL_CA=]
--cell-bearer <CELL_BEARER> Bearer token presented to the local cell scheduler for /api/v1/deploy [env: GRAFOS_CELL_BEARER=]
--poll-interval-ms <POLL_INTERVAL_MS> Poll interval when no work is available. Only kicks in for a `wait_seconds=0` (immediate-response) poll loop. With the default `--wait-seconds=30`, the scheduler holds the connection until work arrives or the long-poll window expires, and the agent reconnects immediately, so this knob effectively becomes the "no work" idle gap and stays small [env: GRAFOS_CELL_AGENT_POLL_MS=] [default: 1000]
--wait-seconds <WAIT_SECONDS> Long-poll wait window sent on each `/api/v1/cells/work/poll`. The scheduler caps this at 60s server-side; the agent default of 30s keeps the round-trip well below typical L7 idle limits while waking promptly when work is enqueued [env: GRAFOS_CELL_AGENT_WAIT_SECS=] [default: 30]
--timeout-secs <TIMEOUT_SECS> HTTP read timeout. Must exceed `--wait-seconds` plus a few seconds of slack so the long-poll connection doesn't time out under the scheduler's parked-condvar wait [env: GRAFOS_CELL_AGENT_TIMEOUT_SECS=] [default: 300]
--max-iterations <MAX_ITERATIONS> Number of poll iterations before exiting. Omit to run forever
--once Register and process at most one assignment, then exit
--state-dir <STATE_DIR> Optional state directory for in-flight assignment resume. When set, the agent persists `(provider, cell_id, assignment_id, generation)` to `<state_dir>/cell-agent-inflight.json` immediately on accepting work and clears it on completion (200 or 409). On startup, if the file exists, the agent finalizes that assignment first before polling for new work, ensuring a crash mid-completion can't lose artifact state [env: GRAFOS_CELL_AGENT_STATE_DIR=]
--max-restart-attempts <MAX_RESTART_ATTEMPTS> Maximum reconnect attempts after transient HTTP errors before the agent gives up and exits. Omit (or 0) for unlimited retries, which is the default for production cells. Useful for tests that want a bounded run [default: 0]
--total-mem <TOTAL_MEM> Advertised memory capacity [env: GRAFOS_RONALD_TOTAL_MEM=] [default: 1073741824]
--available-mem <AVAILABLE_MEM> Advertised available memory [env: GRAFOS_RONALD_AVAILABLE_MEM=] [default: 1073741824]
--total-cpu <TOTAL_CPU> Advertised CPU capacity [env: GRAFOS_RONALD_TOTAL_CPU=] [default: 4]
--available-cpu <AVAILABLE_CPU> Advertised available CPU capacity [env: GRAFOS_RONALD_AVAILABLE_CPU=] [default: 4]
--region <REGION> Provider-side cost metadata sent on register so the orchestrator can populate `CloudRunRecord.cost` for every run that lands on this cell. Optional: older cells that don't pass these flags keep registering (cost stays `None` on the run record). Per-cloud cell-bootstrap scripts derive these from their cloud-side context (AWS region, GCP zone, etc.) and pass them through [env: GRAFOS_RONALD_REGION=]
--zone <ZONE> [env: GRAFOS_RONALD_ZONE=]
--instance-type <INSTANCE_TYPE> [env: GRAFOS_RONALD_INSTANCE_TYPE=]
--account <ACCOUNT> Cloud account / project / subscription id. AWS account id (12 digits), GCP project id, Azure subscription id. Lab cells may leave this unset [env: GRAFOS_RONALD_ACCOUNT=]
-h, --help Print help
-V, --version Print version

grafos cloud bootstrap-cell
Provider cell identity bootstrap. Generates a local keypair on the cell host, builds a CSR, exchanges it (plus a one-use bootstrap token from `provider init` / `cells/bootstrap/tokens`) for an mTLS cert at `/api/v1/cells/bootstrap/exchange`, and writes `cell-key.pem` + `cell-cert.pem` + `ca-bundle.pem` + `identity.json` into the identity directory. Subsequent `cloud cell-agent` invocations point `--cell-agent-cert` / `--cell-agent-key` at the identity directory's files
Usage: grafos cloud bootstrap-cell [OPTIONS] --bootstrap-token <BOOTSTRAP_TOKEN> --scheduler <SCHEDULER> --provider <PROVIDER> --identity-dir <IDENTITY_DIR>
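The identity directory rules for this command (created with 0700 permissions if missing, existing key and cert preserved unless `--force` is passed) can be sketched as a guard that runs before anything is written. This is an illustration, not the CLI's code: `prepare_identity_dir` is a hypothetical name, and `identity.json` is deliberately left out of the guard because the `--identity-dir` and `--force` descriptions word its handling differently.

```python
from pathlib import Path
from typing import List

# Files the guard protects; names come from the command description.
GUARDED_FILES = ("cell-key.pem", "cell-cert.pem")


def prepare_identity_dir(identity_dir: Path, force: bool = False) -> List[str]:
    """Create the directory if needed and refuse to clobber an existing identity.

    Returns the guarded files that already exist (and would be overwritten
    when force=True). Raises FileExistsError when they exist and force=False.
    """
    identity_dir.mkdir(mode=0o700, parents=True, exist_ok=True)
    existing = [name for name in GUARDED_FILES if (identity_dir / name).exists()]
    if existing and not force:
        raise FileExistsError(
            f"{', '.join(existing)} already present in {identity_dir}; "
            "use rotate-cell-identity for rotation or pass --force"
        )
    return existing
```

Refusing by default mirrors the help text's advice that `rotate-cell-identity` is the right tool once a cell already has an identity.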
Options:
--bootstrap-token <BOOTSTRAP_TOKEN> One-use bootstrap token issued by the scheduler (mint via `grafos provider init` or `POST /api/v1/cells/bootstrap/tokens`). The token is consumed on first successful exchange; a replay fails with `bootstrap_already_used` [env: GRAFOS_BOOTSTRAP_TOKEN=]
--fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var [env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--scheduler <SCHEDULER> Scheduler base URL (orchestrator that minted the token) [env: GRAFOS_SCHEDULER=]
--provider <PROVIDER> Provider id this cell is being onboarded for. MUST match the token's claimed provider; mismatch is rejected with HTTP 403 `bootstrap_provider_mismatch` [env: GRAFOS_PROVIDER=]
--wide Show additional columns in table output
--cell-id-hint <CELL_ID_HINT> Optional cell id hint. When `0` (the default) the scheduler uses the cell_id encoded in the bootstrap token. Pass a non-zero value only when the operator wants to double-check it matches the token [env: GRAFOS_CELL_ID_HINT=] [default: 0]
--no-color Disable color output
--identity-dir <IDENTITY_DIR> Directory where the cell's private key + issued cert + CA bundle + identity.json will live. Created with 0700 perms if missing; existing files (other than `identity.json`) are preserved unless `--force` is set [env: GRAFOS_CELL_IDENTITY_DIR=]
--pool <POOL> Pool name (default: "default") [default: default]
--scheduler-ca <SCHEDULER_CA> CA bundle the CLI uses to verify the scheduler's TLS cert. Plain HTTP works without this (test/dev only) [env: GRAFOS_SCHEDULER_CA=]
--agent-version <AGENT_VERSION> Cell-agent version string echoed in the bootstrap audit event. Does NOT affect cert issuance [env: GRAFOS_CELL_AGENT_VERSION=] [default: "grafos-cli/cloud bootstrap-cell"]
--force Overwrite `cell-key.pem` / `cell-cert.pem` / `identity.json` if present. Defaults to refusing; `rotate-cell-identity` is the right command for steady-state rotation
-h, --help Print help
-V, --version Print version

grafos cloud rotate-cell-identity
Rotate the cell's mTLS identity. Reads the current cert + key from the identity dir, builds a fresh CSR, presents it (under the current cert as mTLS auth) to `/api/v1/cells/identity/rotate`, and atomically replaces `cell-cert.pem` + `identity.json`. Refuses if the current identity is missing, expired, or revoked. Designed to run from a systemd timer (or an idle moment in the agent's main loop) BEFORE `now >= rotates_after`
Usage: grafos cloud rotate-cell-identity [OPTIONS] --identity-dir <IDENTITY_DIR>
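A timer-driven wrapper around the usage line above might look like the following sketch. It assumes the caller already knows the `rotates_after` deadline as Unix epoch seconds; how that value is stored or exposed is not specified here. The helper names `should_rotate` and `rotate_if_due` and the one-hour safety margin are hypothetical, and whether the scheduler accepts a rotation arbitrarily far ahead of the deadline is not stated in this reference.

```shell
#!/bin/sh
# Sketch: rotate the cell identity shortly before its rotates_after deadline.
# ASSUMPTION: ROTATES_AFTER is supplied by the caller as Unix epoch seconds.

# should_rotate NOW ROTATES_AFTER MARGIN
# Exit status 0 once NOW is within MARGIN seconds of ROTATES_AFTER (or past it).
should_rotate() {
  [ "$1" -ge $(( $2 - $3 )) ]
}

# rotate_if_due IDENTITY_DIR ROTATES_AFTER [MARGIN]
# MARGIN defaults to 3600 seconds (an illustrative value, not a documented one).
rotate_if_due() {
  if should_rotate "$(date +%s)" "$2" "${3:-3600}"; then
    # Default behaviour: the cert is rolled and the existing key is reused.
    grafos cloud rotate-cell-identity --identity-dir "$1"
  else
    echo "rotation not due yet"
  fi
}
```

Running the check ahead of the deadline leaves room for a retry on the next timer tick, since the command refuses to run once the current identity has expired.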
Options: --fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var
[env: GRAFOS_FABRIC=]
--identity-dir <IDENTITY_DIR> Identity directory (the same path used by `bootstrap-cell --identity-dir`)
[env: GRAFOS_CELL_IDENTITY_DIR=]
--json Output in JSON format for scripting
--scheduler <SCHEDULER> Scheduler base URL. Defaults to the value persisted in `identity.json` from the original bootstrap; pass this flag only when the URL has changed under the cell
[env: GRAFOS_SCHEDULER=]
--scheduler-ca <SCHEDULER_CA> CA bundle used to verify scheduler TLS
[env: GRAFOS_SCHEDULER_CA=]
--wide Show additional columns in table output
--no-color Disable color output
--rotate-key Generate a new keypair instead of reusing the existing one. Defaults to false: rotation rolls the cert but preserves the key, which matches the design doc's expected 12-hour `rotates_after` pattern. Roll the key only when policy requires it (e.g. compromise response)
--pool <POOL> Pool name (default: "default")
[default: default]
-h, --help Print help
-V, --version Print version
grafos cloud doctor
Bundle the connector + STS + cell + tenant-deploy probes. Verifies that an AWS-backed fabric cell can accept tenant-scoped deploy requests right now
Usage: grafos cloud doctor [OPTIONS] <COMMAND>
Commands: aws `grafos cloud doctor aws --mode tenura-managed|customer-owned` help Print this message or the help of the given subcommand(s)
Options: --fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var
[env: GRAFOS_FABRIC=]
--json Output in JSON format for scripting
--wide Show additional columns in table output
--no-color Disable color output
--pool <POOL> Pool name (default: "default")
[default: default]
-h, --help Print help
-V, --version Print version
grafos cloud doctor aws
`grafos cloud doctor aws --mode tenura-managed|customer-owned`
Usage: grafos cloud doctor aws [OPTIONS] --mode <MODE>
Options: --fabric <FABRIC> Fabric address (host:port). Overrides GRAFOS_FABRIC env var
[env: GRAFOS_FABRIC=]
--mode <MODE> Which ownership mode to probe. Tenura-managed and customer-owned share the same probe set; the difference is whether the `connector` and `sts` sections run
Possible values:
- tenura-managed: Tenura-managed cells: skip the per-user STS verify (no per-user role exists) and probe whatever cells the cell-pool API returned. v1 of the doctor uses the same on-disk `aws-cells.json` as customer-owned for the probe set; the dedicated Tenura pool API lands in 200.b
- customer-owned: Customer-owned cells: the full set (connector + STS + cell + lease)
--cell-id <CELL_ID> Limit cell discovery to a specific cell id. Default is to probe every cell of the matching mode in `.grafos/cloud/aws-cells.json`
--tenant <TENANT> Tenant identity for the lease probe. Defaults to the cell's `tenant_name` (the value passed to `grafos cloud provision aws --tenant ...`). Override here if the cell has additional tenants registered
--wide Show additional columns in table output
--json Emit machine-readable JSON instead of the human-friendly section table. Useful for the dashboard's connector-state probe (200.f)
--no-color Disable color output
--pool <POOL> Pool name (default: "default")
[default: default]
-h, --help Print help (see a summary with '-h')
-V, --version Print version
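The options above can be combined in a small wrapper, sketched below for scripted use. It only assembles flags that are documented for `grafos cloud doctor aws`; the function name `doctor_aws` and the values in the example calls are hypothetical. The shape of the JSON output and the command's exit codes are not described in this reference, so the sketch passes both through without interpreting them.

```shell
#!/bin/sh
# Sketch: run the AWS doctor in JSON mode with optional cell and tenant filters.
# doctor_aws is an illustrative helper, not part of the grafos CLI.

# doctor_aws MODE [CELL_ID] [TENANT]
doctor_aws() {
  doctor_mode="$1"
  doctor_cell="${2:-}"
  doctor_tenant="${3:-}"

  # Only the two documented ownership modes are accepted.
  case "$doctor_mode" in
    tenura-managed|customer-owned) ;;
    *) echo "unknown mode: $doctor_mode" >&2; return 2 ;;
  esac

  # Build the argument list, adding the optional filters only when given.
  set -- cloud doctor aws --mode "$doctor_mode" --json
  if [ -n "$doctor_cell" ]; then set -- "$@" --cell-id "$doctor_cell"; fi
  if [ -n "$doctor_tenant" ]; then set -- "$@" --tenant "$doctor_tenant"; fi

  grafos "$@"
}

# Example calls (placeholder values):
#   doctor_aws customer-owned
#   doctor_aws customer-owned 7 acme > doctor-report.json
```

Omitting the cell id keeps the default behaviour of probing every cell of the matching mode in `.grafos/cloud/aws-cells.json`.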