
Agent reasoning (reference)

How agents think, what extended thinking is, when to override.

Agents in Simulate think before they act. This page covers how that thinking is captured, when it is shown to the operator, and how to intervene at decision points where the engine asks for a call.

#Overview

Each actor set to agent mode is driven by a ClaudeAgent. Before every tick the agent:

Reads the scenario brief, the current game state, its own persona, and what other actors did last tick.
Reasons about what its organisation would do in character, weighing the non-obvious trade-offs before committing to a move.
Submits an action and a prose justification. The justification explains the reasoning in the organisation’s own voice.
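The three steps above can be sketched as a small loop body. This is a minimal illustration and not the engine's code: `TickContext`, `Submission`, `run_tick`, and `cautious_decide` are hypothetical names, and the model call is replaced by a plain function so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class TickContext:
    """Everything the agent reads before acting (hypothetical shape)."""
    scenario_brief: str
    game_state: dict
    persona: str
    last_tick_actions: list  # what other actors did last tick


@dataclass
class Submission:
    """What the agent hands back: an action plus its prose justification."""
    actor: str
    action: dict
    justification: str


def run_tick(
    actor: str,
    ctx: TickContext,
    decide: Callable[[TickContext], Tuple[dict, str]],
) -> Submission:
    # Step 1 (read) is the ctx argument; step 2 (reason) is delegated to `decide`.
    action, justification = decide(ctx)
    # Step 3 (submit): in this sketch an action without a justification is rejected.
    if not justification.strip():
        raise ValueError("every action needs a prose justification")
    return Submission(actor=actor, action=action, justification=justification)


def cautious_decide(ctx: TickContext) -> Tuple[dict, str]:
    # Stand-in for the model call; the "reserve" key is an assumption.
    if ctx.game_state.get("reserve", 0) < 10:
        return {"type": "hold"}, "Our reserves are thin, so we hold this tick."
    return {"type": "invest", "amount": 5}, "We can afford a modest investment."
```

Passing the decision function in as a parameter keeps the read and submit steps identical whether the decision comes from a model or from a stub.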

The alternative to agent mode is ScriptedAgent, which follows a fixed decision tree without calling the model. ScriptedAgents are faster and deterministic; use them for actors whose behaviour you want to control precisely rather than simulate.
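To illustrate what "a fixed decision tree without calling the model" can look like, here is a hedged sketch. The function name, the state keys (`reserve`, `threat`), and the thresholds are all invented for the example; the real ScriptedAgent interface is not described on this page.

```python
def scripted_decide(state: dict) -> dict:
    """A fixed decision tree: the same state always yields the same action."""
    # All keys and thresholds below are hypothetical.
    if state.get("reserve", 0) < 10:
        return {"type": "hold"}
    if state.get("threat", 0.0) > 0.5:
        return {"type": "defend"}
    return {"type": "invest"}
```

Because the function has no randomness and makes no model call, repeated runs over the same game state give identical actions, which is the property the paragraph above relies on.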

#Extended thinking

Extended thinking is enabled for actions where the trade-off is non-obvious. Instead of a single forward pass, the model runs an internal chain-of-thought (the “thinking” trace) before producing the final action. This produces more considered decisions at the cost of higher latency, typically 5-15 seconds per actor per tick.

Extended thinking is enabled when the deployment sets the environment variable WARGAME_CAPTURE_REASONING=1. When it is disabled, agents still reason and act, but the thinking trace is not captured or displayed.
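A deployment script could check the flag along these lines. This is a sketch: the helper name `capture_reasoning_enabled` is hypothetical, and treating every value other than `1` as off is an assumption, since this page only documents the value `1`.

```python
import os
from typing import Mapping, Optional


def capture_reasoning_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when WARGAME_CAPTURE_REASONING is exactly "1"."""
    # Assumption: any other value, or an unset variable, counts as disabled.
    env = os.environ if env is None else env
    return env.get("WARGAME_CAPTURE_REASONING") == "1"
```

Accepting an optional mapping makes the check easy to exercise without touching the real process environment.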

To read the reasoning, switch to the Cockpit view (press Tab). The reasoning trace there groups every actor’s submission by tick. Expand a tick to see each submission’s deliberation summary; expand a submission to read the full deliberation, the public statement, the raw action payload, and, when WARGAME_CAPTURE_REASONING is on, the model’s verbose thinking chain.
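The grouping the Cockpit view performs can be approximated on exported data as follows. This is only a sketch of the idea: the record fields (`tick`, `actor`, `summary`) are assumed for the example and are not a documented schema.

```python
from collections import defaultdict


def group_by_tick(submissions: list) -> dict:
    """Group submission records by their tick, in ascending tick order."""
    grouped = defaultdict(list)
    for submission in submissions:
        grouped[submission["tick"]].append(submission)
    # Return a plain dict sorted by tick so iteration matches reading order.
    return dict(sorted(grouped.items()))
```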

#Arbiter resolutions

Some phases use an arbiter instead of a deterministic resolver. The arbiter reads what all actors submitted and writes a narrated outcome: who got what they wanted, who did not, and why. The arbiter is itself a ClaudeAgent with extended thinking.

The arbiter does not play for any actor. Its job is to adjudicate fairly given what each actor submitted and the scenario rules. For each arbiter phase it produces:

An outcome statement: what happened, factually.
A narrator note: what this means for the game state in narrative terms.
Resource adjustments: the number changes that follow from the outcome.
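The three outputs listed above map naturally onto a small record type. The sketch below is illustrative: `ArbiterResolution`, `apply_adjustments`, and the shape of the adjustments mapping (actor to resource to delta) are assumptions, not the engine's actual types.

```python
from dataclasses import dataclass, field


@dataclass
class ArbiterResolution:
    outcome_statement: str  # what happened, factually
    narrator_note: str      # what it means for the game state, in narrative terms
    # Hypothetical shape: {actor: {resource: signed delta}}
    resource_adjustments: dict = field(default_factory=dict)


def apply_adjustments(balances: dict, resolution: ArbiterResolution) -> dict:
    """Return new balances with the resolution's deltas applied."""
    # Copy first so the caller's balances are left untouched.
    updated = {actor: dict(resources) for actor, resources in balances.items()}
    for actor, deltas in resolution.resource_adjustments.items():
        actor_balances = updated.setdefault(actor, {})
        for resource, delta in deltas.items():
            actor_balances[resource] = actor_balances.get(resource, 0) + delta
    return updated
```

Returning a new mapping instead of mutating in place leaves room for an operator confirmation step between producing a resolution and committing its resource changes.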

Arbiter phases appear as a distinct row in Cockpit view.

#Trust modes

The arbiter has three trust modes, controlled from the trust badge in the console header.

Ask. The arbiter pauses after every resolution and presents its outcome to the operator for confirmation before committing resource changes. Use when the arbiter is adjudicating high-stakes phases and you want to remain in the loop on every call.
Session. The operator has confirmed once in this session that the arbiter may proceed without per-resolution confirmation. The arbiter runs uninterrupted; you see the outcomes in the transcript after each phase.
Always. The arbiter never pauses. Full automation. Use when you trust the setup and want the fastest possible run.

Trust mode is per-session, not per-scenario. It resets to “ask” at the start of each new browser session.
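The pause behaviour of the three modes can be summarised in a few lines. This is a hedged sketch with hypothetical names (`TrustMode`, `needs_confirmation`, `new_session_mode`); it encodes only what this section states.

```python
from enum import Enum


class TrustMode(Enum):
    ASK = "ask"
    SESSION = "session"
    ALWAYS = "always"


def needs_confirmation(mode: TrustMode) -> bool:
    """Only "ask" pauses after a resolution; the other two modes run through."""
    return mode is TrustMode.ASK


def new_session_mode() -> TrustMode:
    """Trust mode is per-session, so every new browser session starts at "ask"."""
    return TrustMode.ASK
```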

#Overriding an agent

Two override mechanisms are available from the console controls bar.

Override resource. Change an actor’s resource balance at the current tick. This does not change the actor’s persona or reasoning; it adjusts their starting position for the next tick. The agent will react to the changed game state in its next action using the same in-character reasoning.
Inject event. Schedule an external event at a future tick. Events affect game state for all actors, not just one. Use to model an exogenous shock the scenario author did not originally include.

Both overrides prompt for a reason note. This note goes into the audit log and appears in the audit-appendix.md file when you export. Any reader of the export can see exactly when and why the operator intervened.
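One way to picture the audit trail is as a list of entries that each carry the operator's reason note. The sketch below is an assumption throughout: `AuditEntry`, `record_override`, `render_appendix`, the `kind` labels, and the line layout are invented for illustration, and the real format of audit-appendix.md is not specified on this page. Rejecting an empty reason is likewise a choice made for the example.

```python
from dataclasses import dataclass


@dataclass
class AuditEntry:
    tick: int
    kind: str    # hypothetical labels: "override_resource" or "inject_event"
    detail: str  # what was changed or scheduled
    reason: str  # the operator's reason note


def record_override(log: list, tick: int, kind: str, detail: str, reason: str) -> AuditEntry:
    """Append an entry to the audit log, insisting on a non-empty reason note."""
    if not reason.strip():
        raise ValueError("a reason note is required")
    entry = AuditEntry(tick=tick, kind=kind, detail=detail, reason=reason.strip())
    log.append(entry)
    return entry


def render_appendix(log: list) -> str:
    """Render the log as Markdown bullet lines (illustrative layout only)."""
    return "\n".join(
        f"- tick {e.tick}: {e.kind} ({e.detail}). Reason: {e.reason}" for e in log
    )
```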