Class ContextPressure

Immutable KV budget snapshot for one tick of the agent loop

Created from SessionContext._storeKvPressure(), which returns { nCtx, cellsUsed, remaining }, where remaining = nCtx - cellsUsed. cellsUsed tracks the unique KV cells held per branch: it is incremented on decode_each / decode_scatter, decremented on release by position - fork_head (the unique cells above the fork point), and reset on bulk operations such as retainOnly and drain.
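The accounting rule above can be illustrated with a small stand-alone model. This is only a sketch: KvLedger, BranchCells, and their members are hypothetical names invented for the example, not part of SessionContext, and the reset behavior for bulk operations is assumed to be "caller supplies the new count".

```typescript
// Hypothetical model of the cellsUsed accounting; not the library's code.
interface BranchCells {
  position: number; // current KV position of the branch
  forkHead: number; // position at which the branch forked from its parent
}

class KvLedger {
  readonly nCtx: number;
  cellsUsed = 0;

  constructor(nCtx: number) {
    this.nCtx = nCtx;
  }

  // Decoding n tokens on a branch consumes n new unique cells.
  decode(branch: BranchCells, n: number): void {
    branch.position += n;
    this.cellsUsed += n;
  }

  // Releasing a branch frees only the cells above its fork point.
  release(branch: BranchCells): void {
    this.cellsUsed -= branch.position - branch.forkHead;
    branch.position = branch.forkHead;
  }

  // Assumption: bulk operations overwrite the counter with a recomputed value.
  reset(cellsUsed: number): void {
    this.cellsUsed = cellsUsed;
  }

  pressure(): { nCtx: number; cellsUsed: number; remaining: number } {
    return {
      nCtx: this.nCtx,
      cellsUsed: this.cellsUsed,
      remaining: this.nCtx - this.cellsUsed,
    };
  }
}

const ledger = new KvLedger(4096);
const root: BranchCells = { position: 0, forkHead: 0 };
ledger.decode(root, 1000);

// A child forks at the root's current position and decodes 200 tokens of its own.
const child: BranchCells = { position: root.position, forkHead: root.position };
ledger.decode(child, 200);
const beforeRelease = ledger.pressure();

// Releasing the child returns only its 200 unique cells; the shared prefix stays.
ledger.release(child);
const afterRelease = ledger.pressure();
```

In this model the shared prefix below fork_head is never double-counted, which is why releasing a forked branch returns only the cells it added itself.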

Two thresholds, softLimit and hardLimit, partition remaining into three zones:

┌──────────────────────────────────────────────────────┐
│                    nCtx                              │
│  ┌──────────┬───────────────────┬──────────────────┐ │
│  │cellsUsed │    headroom > 0   │    softLimit     │ │
│  │ (in use) │   (new work OK)   │   (reserved)     │ │
│  └──────────┴───────────────────┴──────────────────┘ │
│              ◄──────────── remaining ─────────────►  │
│                                                      │
│  headroom = remaining - softLimit                    │
│  critical = remaining < hardLimit                    │
└──────────────────────────────────────────────────────┘

- headroom > 0: room for new work (tool results, generation).
- headroom ≤ 0: over budget. SETTLE rejects tool results and PRODUCE hard-cuts non-terminal tool calls; terminal tools still pass.
- critical: remaining is below hardLimit. Agents are killed before produceSync() to prevent llama_decode crashes.
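The three zones can be expressed as a small classifier. This is a minimal sketch, not the library's code: the Zone labels and the classifyZone function are hypothetical, and checking critical before headroom is an assumption about precedence (with the default thresholds, critical always implies headroom ≤ 0).

```typescript
// Hypothetical zone labels; the source names only "headroom" and "critical".
type Zone = "ok" | "over-budget" | "critical";

function classifyZone(
  remaining: number,
  softLimit: number = 1024, // DEFAULT_SOFT_LIMIT
  hardLimit: number = 128, // DEFAULT_HARD_LIMIT
): Zone {
  // critical = remaining < hardLimit
  if (remaining < hardLimit) return "critical";
  // headroom = remaining - softLimit
  const headroom = remaining - softLimit;
  return headroom > 0 ? "ok" : "over-budget";
}

const zones = [3000, 1024, 500, 128, 127].map((r) => classifyZone(r));
```

Note that remaining equal to softLimit gives headroom of exactly 0, which falls in the over-budget zone, and remaining equal to hardLimit is not yet critical because the comparison is strict.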

Constructors

constructor

new ContextPressure(
ctx: SessionContext,
opts?: PressureThresholds,
): ContextPressure
Parameters
- ctx: SessionContext
- `Optional` opts: PressureThresholds
Returns ContextPressure
- Defined in agents/src/agent-pool.ts:103

Properties

`Readonly` hardLimit

hardLimit: number

Crash-prevention floor: agents are killed when remaining drops below this value.

`Readonly` remaining

remaining: number

KV slots remaining (nCtx - cellsUsed). Infinity when nCtx ≤ 0 (no context limit).
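The Infinity rule has a useful consequence: with no context limit, headroom is always positive and critical is never true. A minimal sketch of that computation follows; remainingOf is a hypothetical helper written for illustration, not a function from the library.

```typescript
// Hypothetical helper mirroring the rule stated above.
function remainingOf(nCtx: number, cellsUsed: number): number {
  // nCtx <= 0 means "no context limit", so the budget is unbounded.
  return nCtx <= 0 ? Infinity : nCtx - cellsUsed;
}

const bounded = remainingOf(4096, 1000);
const unbounded = remainingOf(0, 1000);

// With the default thresholds, an unbounded budget is never over budget or critical.
const unboundedHeadroom = unbounded - 1024;
const unboundedCritical = unbounded < 128;
```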

`Readonly` softLimit

softLimit: number

Remaining KV floor — tokens reserved for downstream work

`Static` `Readonly` DEFAULT_HARD_LIMIT

DEFAULT_HARD_LIMIT: 128

Default hardLimit: 128 tokens crash-prevention floor

`Static` `Readonly` DEFAULT_SOFT_LIMIT

DEFAULT_SOFT_LIMIT: 1024

Default softLimit: 1024 tokens reserved for downstream work

Accessors

critical

get critical(): boolean
remaining < hardLimit — agent must not call produceSync().

Returns boolean
- Defined in agents/src/agent-pool.ts:118

headroom

get headroom(): number
Tokens available for new work: remaining - softLimit. Positive means there is room to accept tool results or continue generating. Zero or negative means over budget: SETTLE rejects tool results and PRODUCE hard-cuts non-terminal tool calls.

Returns number
- Defined in agents/src/agent-pool.ts:115

Methods

canFit

canFit(tokenCount: number): boolean
Returns whether tokenCount tokens can fit while remaining stays above softLimit.
Parameters
- tokenCount: number
Returns boolean
- Defined in agents/src/agent-pool.ts:121
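To show how the members fit together, here is a stand-alone approximation of the class. It is a sketch under stated assumptions: PressureSketch is a hypothetical stand-in that takes remaining directly rather than a SessionContext, the option field names softLimit and hardLimit are assumed, and canFit is assumed to mean tokenCount <= headroom (the source does not say whether the boundary is inclusive).

```typescript
// Hypothetical stand-in for ContextPressure; not the implementation in agent-pool.ts.
class PressureSketch {
  static readonly DEFAULT_SOFT_LIMIT = 1024;
  static readonly DEFAULT_HARD_LIMIT = 128;

  readonly remaining: number;
  readonly softLimit: number;
  readonly hardLimit: number;

  constructor(
    remaining: number,
    opts: { softLimit?: number; hardLimit?: number } = {}, // assumed field names
  ) {
    this.remaining = remaining;
    this.softLimit = opts.softLimit ?? PressureSketch.DEFAULT_SOFT_LIMIT;
    this.hardLimit = opts.hardLimit ?? PressureSketch.DEFAULT_HARD_LIMIT;
  }

  // Tokens available for new work.
  get headroom(): number {
    return this.remaining - this.softLimit;
  }

  // True when the agent must not call produceSync().
  get critical(): boolean {
    return this.remaining < this.hardLimit;
  }

  // Assumption: a request fits if it does not eat into the soft reserve.
  canFit(tokenCount: number): boolean {
    return tokenCount <= this.headroom;
  }
}

const pressure = new PressureSketch(1500);
const fitsSmall = pressure.canFit(400); // 400 <= 476
const fitsExact = pressure.canFit(476); // exactly the headroom
const fitsLarge = pressure.canFit(477); // one token into the reserve

// Custom thresholds shrink the reserve and widen the headroom.
const relaxed = new PressureSketch(1500, { softLimit: 256 });
```

A caller might use canFit to decide whether a tool result of known token length can be accepted in the current tick, and check critical before any further generation.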
