lloyal-agents API Reference
    Class ContextPressure

    Immutable KV budget snapshot for one tick of the agent loop

    Created from SessionContext._storeKvPressure(), which returns { nCtx, cellsUsed, remaining }, where remaining = nCtx - cellsUsed. cellsUsed tracks unique KV cells per branch: it is incremented on decode_each / decode_scatter, decremented on release by position - fork_head (the unique cells above the fork point), and reset on bulk ops such as retainOnly and drain.

    Two thresholds partition remaining into three zones:

    ┌─────────────────────────────────────────────────────┐
    │                        nCtx                         │
    │ ┌──────────┬───────────────────┬──────────────────┐ │
    │ │cellsUsed │   headroom > 0    │    softLimit     │ │
    │ │ (in use) │   (new work OK)   │    (reserved)    │ │
    │ └──────────┴───────────────────┴──────────────────┘ │
    │             ◄───────────── remaining ────────────►  │
    │                                                     │
    │ headroom = remaining - softLimit                    │
    │ critical = remaining < hardLimit                    │
    └─────────────────────────────────────────────────────┘
    • headroom > 0: room for new work (tool results, generation).
    • headroom ≤ 0: over budget. SETTLE rejects tool results and PRODUCE hard-cuts non-terminal tool calls. Terminal tools still pass.
    • critical: remaining is below hardLimit. Agents are killed before produceSync() to prevent llama_decode crashes.
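
    The three zones above can be sketched as a small classifier. This is an illustration only: the names Zone and classify are hypothetical and not part of the lloyal-agents API; only remaining, softLimit, hardLimit, the two formulas, and the default values 1024 and 128 come from this page. The sketch assumes hardLimit < softLimit, so that critical is a subset of the over-budget zone and can be checked first.

```typescript
// Hypothetical helper, not part of lloyal-agents: classifies a KV budget
// snapshot into the three zones described above.
type Zone = "ok" | "overBudget" | "critical";

function classify(
  remaining: number,
  softLimit: number = 1024, // DEFAULT_SOFT_LIMIT from this page
  hardLimit: number = 128,  // DEFAULT_HARD_LIMIT from this page
): Zone {
  // critical = remaining < hardLimit (checked first; assumes hardLimit < softLimit)
  if (remaining < hardLimit) return "critical";
  // headroom = remaining - softLimit
  const headroom = remaining - softLimit;
  return headroom > 0 ? "ok" : "overBudget";
}

// Worked example with an assumed context size of 4096 cells.
const nCtx = 4096;
for (const cellsUsed of [2000, 3500, 4000]) {
  const remaining = nCtx - cellsUsed;
  console.log(`cellsUsed=${cellsUsed} remaining=${remaining} zone=${classify(remaining)}`);
}
// cellsUsed=2000 remaining=2096 zone=ok
// cellsUsed=3500 remaining=596 zone=overBudget
// cellsUsed=4000 remaining=96 zone=critical
```

    Note the boundary cases: a headroom of exactly 0 counts as over budget, and remaining equal to hardLimit is not yet critical.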
    Constructors

    Properties

    hardLimit: number

    Crash-prevention floor: agents are killed when remaining drops below this value.

    remaining: number

    KV slots remaining (nCtx - cellsUsed). Infinity when nCtx ≤ 0 (no context limit).
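
    A minimal sketch of the rule above, assuming a free-standing helper (remainingSlots is a hypothetical name, not a lloyal-agents function):

```typescript
// Hypothetical helper illustrating the documented rule:
// remaining = nCtx - cellsUsed, or Infinity when nCtx <= 0 (no context limit).
function remainingSlots(nCtx: number, cellsUsed: number): number {
  if (nCtx <= 0) return Infinity;
  return nCtx - cellsUsed;
}

console.log(remainingSlots(4096, 1000)); // 3096
console.log(remainingSlots(0, 1000));    // Infinity
```

    With remaining at Infinity, remaining - softLimit is also Infinity, so an unlimited context never reads as over budget or critical.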

    softLimit: number

    Remaining KV floor — tokens reserved for downstream work

    DEFAULT_HARD_LIMIT: 128

    Default hardLimit: a 128-token crash-prevention floor

    DEFAULT_SOFT_LIMIT: 1024

    Default softLimit: 1024 tokens reserved for downstream work

    Accessors

    • get headroom(): number

      Tokens available for new work: remaining - softLimit. Positive means there is room to accept tool results or continue generating. Zero or negative means over budget: SETTLE rejects tool results and PRODUCE hard-cuts non-terminal tool calls.

      Returns number

    Methods

    • Can tokenCount tokens fit while staying above softLimit?

      Parameters

      • tokenCount: number

      Returns boolean
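
    The headroom accessor and the fit check can be sketched together as follows. This is not the library's implementation: the class name PressureSketch and the method name fits are hypothetical, and the inclusive comparison (tokenCount <= headroom) is an assumption about how the boundary is treated.

```typescript
// Hypothetical stand-in for ContextPressure, for illustration only.
class PressureSketch {
  readonly remaining: number;
  readonly softLimit: number;

  constructor(remaining: number, softLimit: number = 1024) {
    this.remaining = remaining;
    this.softLimit = softLimit;
  }

  // Documented formula: headroom = remaining - softLimit.
  get headroom(): number {
    return this.remaining - this.softLimit;
  }

  // Assumption: a request fits when it does not exceed headroom,
  // so remaining never ends up below softLimit after accepting it.
  fits(tokenCount: number): boolean {
    return tokenCount <= this.headroom;
  }
}

const snapshot = new PressureSketch(1500); // headroom = 1500 - 1024 = 476
console.log(snapshot.headroom);  // 476
console.log(snapshot.fits(400)); // true
console.log(snapshot.fits(500)); // false
```

    A caller could use such a check before appending a tool result, and fall back to rejecting or truncating the result when it does not fit.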