Workbooks Documentation

Agents

An agent is a loop: send a prompt to an LLM, receive a tool call, execute it, feed the result back, repeat until the LLM calls done or the wall-clock timeout expires. Agents have no iteration cap — they run until they finish or time out.

The agent module is runtime/host/agent.ex.

State

Each agent run carries a state record:

FieldTypeDescription
vfsmapVirtual filesystem: path → content. Shared across tools in a run.
modelstringLLM model identifier (e.g. xiaomi/mimo-v2.5)
tenantstringScopes the variable store for this run
stepintegerCurrent tool call index
eventslistOrdered log of all tool calls and results
execatom:sandbox or :host — whether shell runs sandboxed
workdirstringWorking directory for shell tool calls

Events are appended to _steps.jsonl in the workdir on every tool call. This gives a persistent, queryable trace of every agent run. wb telemetry reads these files.

Tools

Agents have exactly eight tools. These are the only actions an agent can take:

shell

shell(command: string) → {stdout: string, exit_code: int}

Runs a shell command. In :sandbox mode, the command runs inside the WASM sandbox (limited to PATH tools + the VFS). In :host mode it runs directly on the host in workdir. Default is :host for agents invoked via wb run.

search

search(query: string) → [result]

Semantic search over the workbooks index. Returns matching workbook headlines and their content. Used by agents to discover existing workbooks and toolkits.

fetch

fetch(url: string) → string

HTTP GET. Returns the response body as a string. Used for API calls and documentation lookups.

vfswrite

vfs_write(path: string, content: string) → void

Writes a file to the virtual filesystem. Changes are visible to subsequent vfs_read and shell calls within the same run. VFS writes do not touch the host filesystem unless the agent is in :host mode with a real workdir.

vfsread

vfs_read(path: string) → string

Reads a file from the virtual filesystem. Falls through to the real filesystem if the path is not in the VFS.

wb

wb(args: [string]) → string

Runs a wb CLI command in-process. Equivalent to running wb query, wb tangle, wb lint, etc. from the shell, but without spawning a subprocess. Returns the command output as a string.

run

run(workbook: string) → map

Executes a workbook: compiles its components, runs the DAG, returns the export map. Equivalent to the full compile+execute pipeline on a .org source.

done

done(result: string) → void

Terminates the agent loop and returns result as the agent's output. The LLM calls this when it has a final answer. The loop exits cleanly; no timeout is consumed.

Timeout

Agents use wall-clock TIMEOUT_MS (default: 60 000 ms) not an iteration cap. The loop is while not timed_out — it runs as many tool calls as the LLM wants within the time budget. Set TIMEOUT_MS in the run config to extend or restrict.

Events and telemetry

Every tool call is logged as a JSON event in _steps.jsonl:

{"step": 0, "tool": "shell", "args": {"command": "ls -la"}, "result": {"stdout": "...", "exit_code": 0}, "ms": 42}

wb telemetry reads _steps.jsonl files across all runs and produces a summary. The events array on the agent state is the in-memory view of the same data.

Invoking an agent

From the CLI:

wb run "list all workbooks in this directory"

From Elixir:

Workbooks.Agent.run(%{
  prompt: "list all workbooks",
  model: "xiaomi/mimo-v2.5",
  tenant: "dev",
  workdir: File.cwd!()
})

The engine.wit WIT world

The agent engine is itself a WIT component (engine.wasm). Its WIT world declares:

Imports (capabilities the host provides to the engine):

  • session-info — current tenant, model, workdir

  • vfs-query — VFS read/write

  • run-command — shell execution

  • browse-fetch — HTTP GET

Export:

  • run — the main entry point; takes prompt + config, returns result

This means the agent loop logic runs inside WASM, calling back into the Elixir host for any I/O. The host implements the imported interfaces; the engine implements the loop. This inversion keeps the loop logic portable and auditable.