liblloyal 1.0.0
Branched Inference for llama.cpp
lloyal::branch::KvPressure Struct Reference

Snapshot of KV cache pressure from BranchStore.

#include <lloyal/branch.hpp>

Public Attributes

uint32_t n_ctx
 Total KV capacity.
 
uint32_t cells_used
 Cells allocated since last reset.
 
uint32_t remaining
 n_ctx - cells_used (clamped to 0)
 

Detailed Description

Snapshot of KV cache pressure from BranchStore.

cells_used is incremented on every decode (decode_each, decode_scatter, add_cells_used). It is reset to zero by the bulk operations drain() and init_tenancy(), and also when the last active branch is released. retainOnly() instead resets it to the surviving branch's position.

The count is conservative: it overcounts if individual branches are pruned mid-run, because prune does NOT decrement cells_used. This is safe, since it triggers soft limits sooner rather than later.
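The accounting rules above can be modelled with a small stand-alone sketch. It does not use the library: KvPressureSketch and PressureCounter are hypothetical names introduced here for illustration, and the event methods only mirror the behaviour described in this section (decode adds, prune leaves the count unchanged, bulk operations reset to zero, retainOnly() resets to the survivor's position).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of lloyal::branch::KvPressure, for illustration only.
struct KvPressureSketch {
    std::uint32_t n_ctx;       // total KV capacity
    std::uint32_t cells_used;  // cells allocated since last reset
    std::uint32_t remaining;   // n_ctx - cells_used, clamped to 0
};

// Hypothetical counter that models the documented accounting rules.
// It is NOT the BranchStore implementation.
class PressureCounter {
public:
    explicit PressureCounter(std::uint32_t n_ctx) : n_ctx_(n_ctx) {}

    // Every decode adds the number of cells it allocated.
    void on_decode(std::uint32_t n_cells) { cells_used_ += n_cells; }

    // Pruning a single branch does not decrement the count (conservative).
    void on_prune() {}

    // Bulk operations (drain, init_tenancy, last branch released) reset to zero.
    void on_bulk_reset() { cells_used_ = 0; }

    // retainOnly resets the count to the surviving branch's position.
    void on_retain_only(std::uint32_t survivor_pos) { cells_used_ = survivor_pos; }

    KvPressureSketch snapshot() const {
        // Clamp instead of subtracting blindly: unsigned subtraction would wrap.
        const std::uint32_t remaining =
            cells_used_ >= n_ctx_ ? 0u : n_ctx_ - cells_used_;
        assert(remaining <= n_ctx_);
        return KvPressureSketch{n_ctx_, cells_used_, remaining};
    }

private:
    std::uint32_t n_ctx_;
    std::uint32_t cells_used_ = 0;
};
```

In this model, a counter with n_ctx of 8 that sees a decode of 5 cells, a prune, and a decode of 6 cells reports cells_used of 11 and remaining of 0, which shows both the overcount after pruning and the clamp.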

Definition at line 114 of file branch.hpp.

Member Data Documentation

◆ cells_used

uint32_t lloyal::branch::KvPressure::cells_used

Cells allocated since last reset.

Definition at line 116 of file branch.hpp.

◆ n_ctx

uint32_t lloyal::branch::KvPressure::n_ctx

Total KV capacity.

Definition at line 115 of file branch.hpp.

◆ remaining

uint32_t lloyal::branch::KvPressure::remaining

n_ctx - cells_used (clamped to 0)

Definition at line 117 of file branch.hpp.
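As a sketch of why the clamp matters and how a caller might act on remaining: because all three fields are uint32_t, a plain n_ctx - cells_used would wrap around to a huge value once the conservative count exceeds capacity. The helpers below, clamped_remaining and has_headroom, are hypothetical and not part of liblloyal; the reserve parameter is an assumed caller-side soft limit.

```cpp
#include <cstdint>

// Hypothetical helper: n_ctx - cells_used, clamped to 0 so that an
// overcounted cells_used never wraps around to a large unsigned value.
constexpr std::uint32_t clamped_remaining(std::uint32_t n_ctx,
                                          std::uint32_t cells_used) {
    return cells_used >= n_ctx ? 0u : n_ctx - cells_used;
}

// Hypothetical soft-limit check: true if `needed` more cells fit while
// still leaving `reserve` cells free. The sum is done in 64 bits so that
// needed + reserve cannot overflow.
constexpr bool has_headroom(std::uint32_t remaining,
                            std::uint32_t needed,
                            std::uint32_t reserve) {
    return static_cast<std::uint64_t>(needed) + reserve <= remaining;
}

static_assert(clamped_remaining(4096, 1000) == 3096, "normal case");
static_assert(clamped_remaining(4096, 5000) == 0, "overcount clamps to 0");
static_assert(has_headroom(3096, 3000, 96), "exactly fits with reserve");
static_assert(!has_headroom(3096, 3000, 97), "one cell short");
```

A caller could evaluate has_headroom against a KvPressure snapshot before each decode and stop spawning branches when it returns false; how the snapshot is obtained from BranchStore is not covered on this page.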


The documentation for this struct was generated from the following file: