liblloyal 1.0.0
Branched Inference for llama.cpp
Reusable scratch buffers for multi-sequence batch construction.
#include <lloyal/decode.hpp>
Public Member Functions
| void | resize (int32_t n) |
| llama_batch | as_batch (int32_t n_tokens) |
| | ABI-sensitive: writes llama_batch fields directly (no common_batch_* wrapper exists for external-buffer batches). |
Public Attributes
| std::vector< llama_token > | tokens_ |
| std::vector< llama_pos > | pos_ |
| std::vector< int32_t > | n_seq_id_ |
| std::vector< llama_seq_id > | seq_id_single_ |
| std::vector< llama_seq_id * > | seq_id_ptrs_ |
| std::vector< int8_t > | logits_ |
Holds pre-allocated vectors that back the llama_batch pointers. Reuse a single Scratch across calls to avoid per-decode allocation.
Definition at line 290 of file decode.hpp.
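The reuse pattern described above can be sketched as follows. This is a minimal illustration, not the code in decode.hpp: the bodies of resize() and as_batch() are not shown on this page, so ScratchSketch, its grow-only resize policy, and the build_step helper are all assumptions. The llama.cpp types are replaced by stand-ins that mirror the field names of llama_batch in llama.h, so the sketch compiles without the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins so the sketch builds without llama.h. Field names and order are
// meant to mirror llama_batch; verify against the vendored header.
using llama_token  = int32_t;
using llama_pos    = int32_t;
using llama_seq_id = int32_t;

struct llama_batch {
    int32_t         n_tokens;
    llama_token   * token;
    float         * embd;
    llama_pos     * pos;
    int32_t       * n_seq_id;
    llama_seq_id ** seq_id;
    int8_t        * logits;
};

// Hypothetical re-implementation of Scratch (assumed behaviour throughout).
struct ScratchSketch {
    std::vector<llama_token>    tokens_;
    std::vector<llama_pos>      pos_;
    std::vector<int32_t>        n_seq_id_;
    std::vector<llama_seq_id>   seq_id_single_;
    std::vector<llama_seq_id *> seq_id_ptrs_;
    std::vector<int8_t>         logits_;

    // Assumption: grow-only, so a smaller later batch never reallocates.
    void resize(int32_t n) {
        const std::size_t want = static_cast<std::size_t>(n);
        if (tokens_.size() >= want) return;
        tokens_.resize(want);
        pos_.resize(want);
        n_seq_id_.resize(want, 1);
        seq_id_single_.resize(want);
        seq_id_ptrs_.resize(want);
        logits_.resize(want);
        // seq_id_single_ may have moved; rebuild the per-token pointers.
        for (std::size_t i = 0; i < want; ++i) {
            seq_id_ptrs_[i] = &seq_id_single_[i];
        }
    }

    // Returns a non-owning view: the batch is valid only while this object
    // lives and is not resized to a larger size.
    llama_batch as_batch(int32_t n_tokens) {
        assert(n_tokens >= 0 &&
               static_cast<std::size_t>(n_tokens) <= tokens_.size());
        llama_batch b{};
        b.n_tokens = n_tokens;
        b.token    = tokens_.data();
        b.embd     = nullptr;  // token batch, no embeddings
        b.pos      = pos_.data();
        b.n_seq_id = n_seq_id_.data();
        b.seq_id   = seq_id_ptrs_.data();
        b.logits   = logits_.data();
        return b;
    }
};

// Hypothetical helper: one token per sequence, all at the same position.
llama_batch build_step(ScratchSketch & s,
                       const std::vector<llama_token> & toks,
                       llama_pos pos) {
    const int32_t n = static_cast<int32_t>(toks.size());
    s.resize(n);
    for (int32_t i = 0; i < n; ++i) {
        s.tokens_[i]        = toks[i];
        s.pos_[i]           = pos;
        s.n_seq_id_[i]      = 1;
        s.seq_id_single_[i] = i;  // token i belongs to sequence i
        s.logits_[i]        = 1;  // request logits for every token
    }
    return s.as_batch(n);
}
```

Keeping one ScratchSketch alive and passing it to build_step on every step means the vectors are allocated once for the largest batch seen; later calls only overwrite existing elements.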
| llama_batch lloyal::decode::Scratch::as_batch (int32_t n_tokens) | inline |
ABI-sensitive: writes llama_batch fields directly (no common_batch_* wrapper exists for external-buffer batches).
Re-audit this function whenever the llama.cpp submodule is bumped, because it depends on the llama_batch field layout.
Definition at line 309 of file decode.hpp.
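One way to make the submodule audit partly mechanical is a compile-time check on the fields that as_batch() writes. The sketch below is an assumption about how such a guard could look, not something decode.hpp is known to contain. The helper name llama_batch_fields_match is hypothetical, and the stand-in struct takes the place of the real definition from llama.h.

```cpp
#include <cstdint>
#include <type_traits>

// Stand-ins for illustration; in a real build, include "llama.h" instead
// and delete these declarations.
using llama_token  = int32_t;
using llama_pos    = int32_t;
using llama_seq_id = int32_t;

struct llama_batch {
    int32_t         n_tokens;
    llama_token   * token;
    float         * embd;
    llama_pos     * pos;
    int32_t       * n_seq_id;
    llama_seq_id ** seq_id;
    int8_t        * logits;
};

// Hypothetical helper: true when every field written by as_batch() still
// exists with the type the scratch vectors assume. A renamed or removed
// field fails to compile; a retyped field makes this return false.
constexpr bool llama_batch_fields_match() {
    return std::is_same<decltype(llama_batch::n_tokens), int32_t>::value
        && std::is_same<decltype(llama_batch::token), llama_token *>::value
        && std::is_same<decltype(llama_batch::embd), float *>::value
        && std::is_same<decltype(llama_batch::pos), llama_pos *>::value
        && std::is_same<decltype(llama_batch::n_seq_id), int32_t *>::value
        && std::is_same<decltype(llama_batch::seq_id), llama_seq_id **>::value
        && std::is_same<decltype(llama_batch::logits), int8_t *>::value;
}

static_assert(llama_batch_fields_match(),
              "llama_batch changed: re-audit Scratch::as_batch()");
static_assert(std::is_standard_layout<llama_batch>::value,
              "llama_batch is expected to remain a plain C struct");
```

A check like this does not detect newly added fields, so it complements the manual audit and does not replace it.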
| void lloyal::decode::Scratch::resize (int32_t n) | inline |
Definition at line 298 of file decode.hpp.
| std::vector<int8_t> lloyal::decode::Scratch::logits_ |
Definition at line 296 of file decode.hpp.
| std::vector<int32_t> lloyal::decode::Scratch::n_seq_id_ |
Definition at line 293 of file decode.hpp.
| std::vector<llama_pos> lloyal::decode::Scratch::pos_ |
Definition at line 292 of file decode.hpp.
| std::vector<llama_seq_id*> lloyal::decode::Scratch::seq_id_ptrs_ |
Definition at line 295 of file decode.hpp.
| std::vector<llama_seq_id> lloyal::decode::Scratch::seq_id_single_ |
Definition at line 294 of file decode.hpp.
| std::vector<llama_token> lloyal::decode::Scratch::tokens_ |
Definition at line 291 of file decode.hpp.