liblloyal 1.0.0
Branched Inference for llama.cpp
Snapshot of KV cache pressure from BranchStore.
#include <lloyal/branch.hpp>
Public Attributes
| Type | Name | Description |
|---|---|---|
| uint32_t | n_ctx | Total KV capacity. |
| uint32_t | cells_used | Cells allocated since last reset. |
| uint32_t | remaining | n_ctx - cells_used (clamped to 0) |
cells_used is incremented on every decode (via decode_each, decode_scatter, and add_cells_used) and is reset to zero by bulk operations: drain(), init_tenancy(), and the release of the last active branch. retainOnly() instead resets it to the surviving branch's position.
The count is conservative: it overcounts when individual branches are pruned mid-run, because prune does NOT decrement it. This is safe, since overcounting only triggers soft limits sooner rather than later.
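The counter lifecycle described above can be modeled in a few lines. The following is a minimal sketch, not liblloyal's implementation: `CellCounter` and its `on_*` methods are hypothetical names, and the assumption that a decode adds one cell per token is ours (the source only says the count is incremented on every decode).

```cpp
#include <cstdint>

// Hypothetical stand-in for the cells_used bookkeeping; NOT the real BranchStore.
class CellCounter {
public:
    // Assumption: a decode of n_tokens adds n_tokens cells to the count.
    void on_decode(std::uint32_t n_tokens) { cells_used_ += n_tokens; }

    // Pruning a single branch leaves the count untouched (conservative overcount).
    void on_prune() {}

    // Bulk operations (drain, init_tenancy, last active branch released)
    // reset the count to zero.
    void on_bulk_reset() { cells_used_ = 0; }

    // retainOnly: the count becomes the surviving branch's position.
    void on_retain_only(std::uint32_t survivor_position) {
        cells_used_ = survivor_position;
    }

    std::uint32_t cells_used() const { return cells_used_; }

private:
    std::uint32_t cells_used_ = 0;
};
```

In this model, decoding 10 and then 5 tokens followed by a prune still reports 15 cells used; only a bulk reset or a retain-only step lowers the figure.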
Definition at line 114 of file branch.hpp.
uint32_t lloyal::branch::KvPressure::cells_used
Cells allocated since last reset.
Definition at line 116 of file branch.hpp.
uint32_t lloyal::branch::KvPressure::n_ctx
Total KV capacity.
Definition at line 115 of file branch.hpp.
uint32_t lloyal::branch::KvPressure::remaining
n_ctx - cells_used (clamped to 0)
Definition at line 117 of file branch.hpp.
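Because all three fields are unsigned, a plain `n_ctx - cells_used` would wrap around to a very large value whenever the conservative count exceeds capacity, which is why the clamp matters. One way to compute a clamped value is sketched below. `KvPressureSketch` and `make_pressure` are hypothetical names that mirror the documented fields for illustration; how liblloyal itself fills the struct is not shown in this page.

```cpp
#include <cstdint>

// Local mirror of the three documented fields; illustrative only.
struct KvPressureSketch {
    std::uint32_t n_ctx;       // total KV capacity
    std::uint32_t cells_used;  // cells allocated since last reset
    std::uint32_t remaining;   // n_ctx - cells_used, clamped to 0
};

// Hypothetical helper: compare before subtracting so the unsigned
// difference can never wrap around.
inline KvPressureSketch make_pressure(std::uint32_t n_ctx,
                                      std::uint32_t cells_used) {
    const std::uint32_t remaining =
        cells_used >= n_ctx ? 0u : n_ctx - cells_used;
    return KvPressureSketch{n_ctx, cells_used, remaining};
}
```

With a capacity of 4096, a count of 1000 leaves 3096 cells remaining, while an overcounted 5000 yields 0 instead of wrapping.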