liblloyal 1.0.0
Branched Inference for llama.cpp
Input item for decode::scatter: multiple tokens for one sequence.
#include <lloyal/decode.hpp>
Public Attributes
| Type | Name | Description |
| --- | --- | --- |
| std::span< const llama_token > | tokens | Token array (non-owning view). |
| llama_pos | start_pos | KV cache position for first token. |
| llama_seq_id | seq_id | Target sequence ID. |
| bool | output_logits = false | When true, compute logits for last token in this run. |
Input item for decode::scatter — multiple tokens for one sequence.
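Uses std::span for a non-owning view of the token array. The span carries both the pointer and the length, which rules out the mismatches that can occur when a raw pointer and a count are passed separately. Because the span does not own its data, the underlying token storage must remain valid for as long as the item is in use. An empty span (size 0) is valid and is skipped by scatter().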
Definition at line 277 of file decode.hpp.
| bool lloyal::decode::ScatterItem::output_logits = false |
When true, compute logits for last token in this run.
Definition at line 281 of file decode.hpp.
| llama_seq_id lloyal::decode::ScatterItem::seq_id |
Target sequence ID.
Definition at line 280 of file decode.hpp.
| llama_pos lloyal::decode::ScatterItem::start_pos |
KV cache position for first token.
Definition at line 279 of file decode.hpp.
| std::span<const llama_token> lloyal::decode::ScatterItem::tokens |
Token array (non-owning view)
Definition at line 278 of file decode.hpp.