Playbook: triage secret failures

This playbook walks through diagnosing SECRETS_* failures step by step. Use it when a job fails before script execution with a secrets-related error.

If the failure is not secrets-related, start with the general Diagnose a failing run playbook instead.

Before you start

Locate the run logs directory:

.loom/.runtime/logs/<run_id>/

If you don't have the run_id, find it from the CLI output or the most recent receipt in .loom/.runtime/receipts/.
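
As a rough sketch, the newest receipt can be found by modification time. The helper name latest_receipt is made up for this example, and it assumes receipts are written directly under the receipts directory. The receipt format is not described here, so open the file it prints to read the run_id.

```shell
# Hypothetical helper: print the most recently modified entry in a receipts directory.
# Assumes receipts sit directly under the directory (default: .loom/.runtime/receipts).
latest_receipt() {
  ls -t "${1:-.loom/.runtime/receipts}" | head -n 1
}

# Example: latest_receipt            # uses the default directory
```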

Examples below use the release-installed loom binary. If you're contributing from a Loom checkout, substitute the repo-local command you already use, such as nix develop -c ./bin/loom.

Step-by-step

1) Confirm the pipeline failed on a secrets error

Open pipeline/summary.json and check the exit status.

Then open pipeline/manifest.json to find failing_job_id and failing_job_manifest_path.
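
If you prefer the terminal to an editor, the two fields can be pulled out with standard tools. This is a minimal sketch: json_string_field is a hypothetical helper, it assumes both values are plain JSON strings, and it relies on grep -o (available in GNU and BSD grep). If jq is installed, jq -r '.failing_job_id' pipeline/manifest.json is sturdier, assuming the key is top-level.

```shell
# Hypothetical helper: print the string value of the first matching JSON key.
# Only handles simple "key": "value" pairs without escaped quotes.
json_string_field() {
  grep -o "\"$1\"[[:space:]]*:[[:space:]]*\"[^\"]*\"" "$2" \
    | head -n 1 \
    | sed 's/^"[^"]*"[[:space:]]*:[[:space:]]*"\(.*\)"$/\1/'
}

# Example (run from .loom/.runtime/logs/<run_id>/):
#   json_string_field failing_job_id pipeline/manifest.json
#   json_string_field failing_job_manifest_path pipeline/manifest.json
```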

2) Read the failing job manifest

Open jobs/<job_id>/manifest.json.

Secrets failures usually show up in a system section (not a user script step), because secrets resolve before script execution. Look for:

  • failing_section — if it says provider or a system section, this is likely a secrets resolution failure.
  • system_sections[].events_path — the event stream for the failing system section.

If failing_step_events_path points to a user step instead, a script may depend on an optional secret that was not resolved. Check the step events for clues.
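
To see these fields at a glance, a plain grep over the job manifest is usually enough. The sketch below assumes the manifest is pretty-printed with one key per line; on a minified file grep prints the whole line. The helper name show_failure_pointers is hypothetical.

```shell
# Hypothetical helper: show the manifest lines that name the failing section
# and the event-stream paths, with line numbers.
show_failure_pointers() {
  grep -n -E '"(failing_section|failing_step_events_path|events_path)"' "$1"
}

# Example: show_failure_pointers "jobs/<job_id>/manifest.json"
```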

3) Read the failing event stream

Open the events file from the job manifest (typically jobs/<job_id>/system/provider/events.jsonl).

Search for events with a SECRETS_* error code. The event metadata includes:

  • Provider scheme (env, keepass, etc.)
  • Secret variable name (the key from the secrets block)
  • Error code (one of the codes below)

Extract the smallest excerpt that shows the error — typically 5–10 lines.
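
One possible way to do the search from a shell is shown below. Both helper names are hypothetical, and the sketch only assumes that the error code appears as literal text somewhere on the event line; it makes no assumption about the JSON field names.

```shell
# Hypothetical helpers for scanning an events.jsonl file.

# List the distinct SECRETS_* codes that appear in the stream.
secrets_codes() {
  grep -o 'SECRETS_[A-Z_]*' "$1" | sort -u
}

# Print the first matching events (default 10) with line numbers, for a short excerpt.
secrets_excerpt() {
  grep -n 'SECRETS_[A-Z_]*' "$1" | head -n "${2:-10}"
}

# Example: secrets_codes "jobs/<job_id>/system/provider/events.jsonl"
```

Review the excerpt against the redaction checklist at the end of this playbook before sharing it.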

4) Match the error code and fix

| Code | What it means | First thing to check |
| --- | --- | --- |
| SECRETS_PROVIDER_UNAVAILABLE | Provider backend cannot be reached or auth/config is invalid | Verify LOOM_KEEPASS_DB_<ALIAS>_* env vars (KeePass) or that the env var exists (env://). |
| SECRETS_REF_INVALID | The ref URI is malformed or uses an unsupported scheme | Check for typos in the scheme (keepas:// vs keepass://), missing scheme, or empty ref. |
| SECRETS_REF_NOT_FOUND | Provider is reachable but the referenced entry/field does not exist | For env://: is the variable exported? For keepass://: does the entry path/field match? |
| SECRETS_REQUIRED_MISSING | A required secret could not be resolved (catch-all) | Verify ref URI and provider access. Set required: false if the secret is non-critical. |
| SECRETS_UNSAFE_DEBUG_TRACE | CI_DEBUG_TRACE=true with file: false secrets | Remove CI_DEBUG_TRACE, switch secrets to file: true, or remove secrets from the job. |

For full details on each code, see Secrets error codes.

5) Common fix patterns

"My KeePass secret fails with SECRETS_PROVIDER_UNAVAILABLE"

The vault alias mapping or credentials are not configured in the runtime environment. Verify these env vars are set:

  • LOOM_KEEPASS_DB_<ALIAS>_PATH — absolute path to the .kdbx file.
  • LOOM_KEEPASS_DB_<ALIAS>_PASSWORD_ENV — name of env var holding the master password.
  • The env var named in PASSWORD_ENV (or KEYFILE_ENV) must also be set with the actual credential.

export LOOM_KEEPASS_DB_LOCAL_PATH="$HOME/.config/loom/secrets/local.kdbx"
export LOOM_KEEPASS_DB_LOCAL_PASSWORD_ENV="KEEPASS_LOCAL_PASSWORD"
export KEEPASS_LOCAL_PASSWORD="your-master-password"
loom run --local --workflow .loom/workflow.yml
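
Before re-running, it can help to confirm that all three variables are actually visible to child processes. The following is a minimal sketch, not part of Loom: check_keepass_alias is a hypothetical helper, it covers only the password-based setup (not KEYFILE_ENV), and it assumes the database file should exist at the configured path.

```shell
# Hypothetical pre-flight check for one KeePass alias (password-based setup only).
# Prints what is missing without ever printing the credential itself.
check_keepass_alias() {
  prefix="LOOM_KEEPASS_DB_$1"
  rc=0

  if db_path="$(printenv "${prefix}_PATH")"; then
    # Assumption: the database file should exist where the path points.
    [ -f "$db_path" ] || { echo "${prefix}_PATH points to a missing file: $db_path"; rc=1; }
  else
    echo "${prefix}_PATH is not exported"; rc=1
  fi

  if pw_var="$(printenv "${prefix}_PASSWORD_ENV")"; then
    # Check that the variable named by PASSWORD_ENV is itself exported.
    printenv "$pw_var" >/dev/null || { echo "$pw_var is not exported"; rc=1; }
  else
    echo "${prefix}_PASSWORD_ENV is not exported"; rc=1
  fi

  return "$rc"
}

# Example: check_keepass_alias LOCAL && loom run --local --workflow .loom/workflow.yml
```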

"My env:// secret is not found"

The referenced environment variable is not set in the shell that runs Loom. Export it before running:

export MY_TOKEN=actual-value
loom run --local --workflow .loom/workflow.yml
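
A common cause is a variable that is set in the shell but never exported, so child processes such as loom cannot see it. A small sketch for checking this, with require_exported as a hypothetical helper name:

```shell
# Hypothetical helper: report any named variables that child processes would not see.
# printenv only sees exported variables, which is what a child process inherits.
require_exported() {
  rc=0
  for name in "$@"; do
    printenv "$name" >/dev/null || { echo "$name is not exported"; rc=1; }
  done
  return "$rc"
}

# Example: require_exported MY_TOKEN && loom run --local --workflow .loom/workflow.yml
```

Note that an exported but empty variable still passes this check.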

"Debug trace is blocked"

Loom prevents CI_DEBUG_TRACE=true when any secret uses file: false, because shell tracing would expose the secret value. Do one of the following:

  • Remove CI_DEBUG_TRACE=true from the job's variables.
  • Change the secrets to file: true (the default). File paths are safe to trace.
  • Remove the secrets from the job if it does not need them.
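
To locate the conflicting settings, a text search over the workflow file is a reasonable first pass. This sketch assumes the settings appear literally as CI_DEBUG_TRACE and file: false in the YAML; find_trace_conflicts is a hypothetical helper name.

```shell
# Hypothetical helper: list workflow lines that mention debug tracing or file: false.
find_trace_conflicts() {
  grep -n -E 'CI_DEBUG_TRACE|file:[[:space:]]*false' "$1"
}

# Example: find_trace_conflicts .loom/workflow.yml
```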

"A secret is optional but the job still behaves wrong"

Optional secrets (required: false) are silently omitted when unresolved. If your script assumes the variable is always set, it may fail with a confusing error unrelated to secrets. Check whether the script handles the missing-variable case.
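
One way a script can handle the missing-variable case is to test for the variable explicitly and skip the dependent work. The sketch below uses MY_TOKEN as the variable name and a made-up publish step; adapt both to your job.

```shell
# Sketch: run optional work only when the optional secret variable is present.
# "${MY_TOKEN:-}" expands to an empty string when unset, so this is safe under `set -u`.
publish_if_token() {
  if [ -z "${MY_TOKEN:-}" ]; then
    echo "MY_TOKEN is not set; skipping publish step"
    return 0
  fi
  echo "MY_TOKEN is set; running publish step"
}
```

If the secret is actually mandatory for the script, failing fast with a clear message is usually easier to triage than skipping silently.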

6) Verify the fix

After making changes to the workflow or environment, re-run:

loom run --local --workflow .loom/workflow.yml

Then confirm pipeline/summary.json shows a passing status.

Escalation

  • If the error code is unclear or absent: widen to the general Diagnose a failing run playbook.
  • If you suspect a resolver bug (for example, env:// fires SECRETS_PROVIDER_UNAVAILABLE): file an issue with the event excerpt and the workflow snippet. See What to share for packaging evidence safely.

Privacy & redaction checklist (before you paste)

Before sharing any excerpt from a secrets failure:

  • Redact any resolved secret values. They should already be masked as [REDACTED:SECRET_<NAME>] in logs, but check anyway.
  • Redact vault paths or entry names if they reveal sensitive internal structure.
  • Redact environment variable values.
  • State what you redacted so helpers can reason about the missing context.
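
For manual redaction of an excerpt, a literal find-and-replace filter avoids regex escaping problems with paths and tokens. This is a minimal sketch under stated assumptions: redact_literal is a hypothetical helper, and the label format only imitates the mask shown above. The value is passed to awk through the environment rather than the command line so it is not treated as a pattern.

```shell
# Hypothetical helper: replace every literal occurrence of $1 with [REDACTED:$2].
# Reads the excerpt on stdin and writes the redacted text to stdout.
redact_literal() {
  NEEDLE="$1" LABEL="[REDACTED:$2]" awk '
    BEGIN { needle = ENVIRON["NEEDLE"]; label = ENVIRON["LABEL"] }
    needle == "" { print; next }
    {
      out = ""
      # Consume the line left to right, swapping each match for the label.
      while ((i = index($0, needle)) > 0) {
        out = out substr($0, 1, i - 1) label
        $0 = substr($0, i + length(needle))
      }
      print out $0
    }'
}

# Example: redact_literal "$HOME/.config/loom/secrets/local.kdbx" VAULT_PATH < excerpt.txt
```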