OptionalhardCrash-prevention floor (tokens). When remaining drops below this,
agents are killed immediately before produceSync(). Prevents
llama_decode "no memory slot for batch" failures.
Default: 128
OptionalsoftRemaining KV floor for new work (tokens). When remaining drops below this, SETTLE rejects tool results, PRODUCE hard-cuts non-terminal tool calls, and INIT drops agents that don't fit.
Set to account for downstream pool needs (reporters, verification). Default: 1024
KV pressure thresholds controlling agent shutdown under context exhaustion
Two thresholds govern what happens as remaining KV shrinks:
softLimit (default 1024) — remaining KV floor for new work. Enforced at three points:
report()) still pass.Set to account for downstream pool needs (reporters, verification).
hardLimit (default 128) — crash-prevention floor. When remaining drops below this, agents are killed immediately before
produceSync(). Preventsllama_decode"no memory slot" failures. Pure safety net — should never be the primary budget control.