Playbook: triage secret failures
This playbook walks through diagnosing SECRETS_* failures step by step. Use it when a job fails before script execution with a secrets-related error.
If the failure is not secrets-related, start with the general Diagnose a failing run playbook instead.
Related references (keep open)
- Secrets error codes (full code reference)
- Secrets concept (mental model, injection modes, redaction)
- Secrets workflow authoring (YAML syntax, provider URIs)
- Diagnostics ladder (general triage sequence)
- Runtime logs contract (file layout + pointer fields)
Before you start
Locate the run logs directory:
.loom/.runtime/logs/<run_id>/
If you don't have the run_id, find it from the CLI output or the most recent receipt in .loom/.runtime/receipts/.
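To find the newest receipt or run directory from the shell, a minimal sketch follows. It assumes that modification time tracks run order, which this playbook does not guarantee, and newest_entry is a hypothetical helper name.

```shell
# Print the most recently modified entry in a directory (hypothetical helper).
# Assumption: modification time reflects run order.
newest_entry() {
  ls -t "$1" 2>/dev/null | head -n 1
}

# Usage:
#   newest_entry .loom/.runtime/receipts   # most recent receipt
#   newest_entry .loom/.runtime/logs       # most recent run directory
```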
Examples below use the release-installed loom binary. If you're contributing from a Loom checkout, substitute the repo-local command you already use, such as nix develop -c ./bin/loom.
Step-by-step
1) Confirm the pipeline failed on a secrets error
Open pipeline/summary.json and check the exit status.
Then open pipeline/manifest.json to find failing_job_id and failing_job_manifest_path.
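If you prefer to pull the two pointer fields out from the shell, something like the following may help. It is a sketch that assumes each field appears once in the file and holds a plain string value; json_string_field is a hypothetical helper, not part of Loom.

```shell
# Print the string value of a JSON key from a file (hypothetical helper).
# Assumption: the key appears once and its value is a plain string.
# $1: key name, $2: path to the JSON file.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$2" | head -n 1
}

# Usage (run_dir is .loom/.runtime/logs/<run_id>):
#   json_string_field failing_job_id "$run_dir/pipeline/manifest.json"
#   json_string_field failing_job_manifest_path "$run_dir/pipeline/manifest.json"
```

If jq is installed, jq -r '.failing_job_id' is a sturdier way to read the same value, under the additional assumption that the field sits at the top level of the manifest.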
2) Read the failing job manifest
Open jobs/<job_id>/manifest.json.
Secrets failures usually show up in a system section (not a user script step), because secrets resolve before script execution. Look for:
- failing_section: if it says provider or a system section, this is likely a secrets resolution failure.
- system_sections[].events_path: the event stream for the failing system section.
If failing_step_events_path points to a user step instead, the failure may be a script that depends on a missing (optional) secret. Check the step events for clues.
3) Read the failing event stream
Open the events file from the job manifest (typically jobs/<job_id>/system/provider/events.jsonl).
Search for events with a SECRETS_* error code. The event metadata includes:
- Provider scheme (env, keepass, etc.)
- Secret variable name (the key from the secrets block)
- Error code (one of the codes below)
Extract the smallest excerpt that shows the error — typically 5–10 lines.
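The search and the excerpt can be sketched with grep. This assumes only that the error code appears as literal text somewhere in each event line; both helper names are hypothetical.

```shell
# Print matching events with their line numbers (hypothetical helper).
# $1: path to an events.jsonl file, $2: maximum lines to print (default 10).
secrets_events() {
  grep -n 'SECRETS_[A-Z_]*' "$1" | head -n "${2:-10}"
}

# List the distinct SECRETS_* codes that appear in the file.
secrets_codes() {
  grep -o 'SECRETS_[A-Z_]*' "$1" | sort -u
}

# Usage:
#   secrets_codes  jobs/<job_id>/system/provider/events.jsonl
#   secrets_events jobs/<job_id>/system/provider/events.jsonl 10
```

The pattern matches the SECRETS_ prefix only, so redaction markers of the form [REDACTED:SECRET_<NAME>] are not reported as error codes.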
4) Match the error code and fix
| Code | What it means | First thing to check |
|---|---|---|
| SECRETS_PROVIDER_UNAVAILABLE | Provider backend cannot be reached or auth/config is invalid | Verify LOOM_KEEPASS_DB_<ALIAS>_* env vars (KeePass) or that the env var exists (env://). |
| SECRETS_REF_INVALID | The ref URI is malformed or uses an unsupported scheme | Check for typos in the scheme (keepas:// vs keepass://), missing scheme, or empty ref. |
| SECRETS_REF_NOT_FOUND | Provider is reachable but the referenced entry/field does not exist | For env://: is the variable exported? For keepass://: does the entry path/field match? |
| SECRETS_REQUIRED_MISSING | A required secret could not be resolved (catch-all) | Verify ref URI and provider access. Set required: false if the secret is non-critical. |
| SECRETS_UNSAFE_DEBUG_TRACE | CI_DEBUG_TRACE=true with file: false secrets | Remove CI_DEBUG_TRACE, switch secrets to file: true, or remove secrets from the job. |
For full details on each code, see Secrets error codes.
5) Common fix patterns
"My KeePass secret fails with SECRETS_PROVIDER_UNAVAILABLE"
The vault alias mapping or credentials are not configured in the runtime environment. Verify these env vars are set:
- LOOM_KEEPASS_DB_<ALIAS>_PATH: absolute path to the .kdbx file.
- LOOM_KEEPASS_DB_<ALIAS>_PASSWORD_ENV: name of the env var holding the master password.
- The env var named in PASSWORD_ENV (or KEYFILE_ENV) must also be set with the actual credential.
export LOOM_KEEPASS_DB_LOCAL_PATH="$HOME/.config/loom/secrets/local.kdbx"
export LOOM_KEEPASS_DB_LOCAL_PASSWORD_ENV="KEEPASS_LOCAL_PASSWORD"
export KEEPASS_LOCAL_PASSWORD="your-master-password"
loom run --local --workflow .loom/workflow.yml
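Before re-running, a quick preflight can confirm that the settings above are visible to child processes. The helper below is a hypothetical sketch, not a Loom command: it covers only the password-based setup, takes the alias exactly as it appears in the variable names, and reports the first missing setting by name without printing any credential.

```shell
# Check the KeePass settings for one vault alias (hypothetical helper).
# The body runs in a subshell, so its variables do not leak into your session
# and "exit" only leaves the helper, not your shell.
check_keepass_alias() (
  path_var="LOOM_KEEPASS_DB_${1}_PATH"
  pw_env_var="LOOM_KEEPASS_DB_${1}_PASSWORD_ENV"

  # printenv only sees exported variables, which is what a child process gets.
  db_path=$(printenv "$path_var") || { echo "not exported: $path_var"; exit 1; }
  [ -f "$db_path" ] || { echo "no file at the path in $path_var"; exit 1; }

  pw_var=$(printenv "$pw_env_var") || { echo "not exported: $pw_env_var"; exit 1; }
  [ -n "$pw_var" ] || { echo "empty: $pw_env_var"; exit 1; }

  # Check presence only; the credential value is discarded.
  printenv "$pw_var" >/dev/null || { echo "not exported: $pw_var"; exit 1; }

  echo "ok: $1"
)

# Usage:
#   check_keepass_alias LOCAL
```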
"My env:// secret is not found"
The referenced environment variable is not set in the shell that runs Loom. Export it before running:
export MY_TOKEN=actual-value
loom run --local --workflow .loom/workflow.yml
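A variable that is assigned but not exported is invisible to child processes, Loom included. One way to check this without echoing the value is sketched below; is_exported is a hypothetical helper name.

```shell
# Succeed only if the named variable is exported to child processes.
# The value is discarded, so nothing sensitive reaches the terminal.
is_exported() {
  printenv "$1" >/dev/null
}

# Usage:
#   is_exported MY_TOKEN || echo "MY_TOKEN is not exported in this shell"
```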
"Debug trace is blocked"
Loom prevents CI_DEBUG_TRACE=true when any secret uses file: false, because shell tracing would expose the secret value. Either:
- Remove CI_DEBUG_TRACE=true from the job's variables.
- Change the secrets to file: true (the default); file paths are safe to trace.
"A secret is optional but the job still behaves wrong"
Optional secrets (required: false) are silently omitted when unresolved. If your script assumes the variable is always set, it may fail with a confusing error unrelated to secrets. Check whether the script handles the missing-variable case.
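One way a job script can handle the missing-variable case is an explicit guard. In this sketch, DEPLOY_TOKEN and publish_step are hypothetical names used only for illustration, and skipping is just one possible policy; failing with a clear message is equally valid.

```shell
# DEPLOY_TOKEN is a hypothetical optional secret.
# "${DEPLOY_TOKEN:-}" expands to an empty string when the variable is unset,
# so the check also works in scripts that run under "set -u".
publish_step() {
  if [ -z "${DEPLOY_TOKEN:-}" ]; then
    echo "DEPLOY_TOKEN not provided; skipping publish"
    return 0
  fi
  echo "DEPLOY_TOKEN provided; publishing"
}
```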
6) Verify the fix
After making changes to the workflow or environment, re-run:
loom run --local --workflow .loom/workflow.yml
Then confirm pipeline/summary.json shows a passing status.
Escalation
- If the error code is unclear or absent: widen to the general Diagnose a failing run playbook.
- If you suspect a resolver bug (for example, env:// fires SECRETS_PROVIDER_UNAVAILABLE): file an issue with the event excerpt and the workflow snippet. See What to share for packaging evidence safely.
Privacy & redaction checklist (before you paste)
Before sharing any excerpt from a secrets failure:
- Redact any resolved secret values (should already be masked as [REDACTED:SECRET_<NAME>] in logs).
- Redact vault paths or entry names if they reveal sensitive internal structure.
- Redact environment variable values.
- State what you redacted so helpers can reason about the missing context.
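For the environment-variable item, a filter along these lines can mask values in NAME=value lines before you paste a shell snippet. It is a rough sketch with a hypothetical helper name: it does not catch values embedded elsewhere (JSON fields, command arguments), so still review the output by hand.

```shell
# Mask the value in NAME=value and "export NAME=value" lines (hypothetical helper).
# Reads stdin and writes stdout; lines that do not match pass through unchanged.
redact_env_values() {
  sed -E 's/^((export[[:space:]]+)?[A-Za-z_][A-Za-z0-9_]*=).*/\1<redacted>/'
}

# Usage:
#   redact_env_values < snippet.sh
```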