liblloyal 1.0.0
Composable primitives for llama.cpp inference
Functions
| float * | get (llama_context *ctx, int32_t step=-1) |
| | Get raw logits pointer (zero-copy) |
inline
Get raw logits pointer (zero-copy)
Returns a pointer to the internal llama.cpp logits buffer. This is a zero-copy operation - no data is copied.
Parameters
| ctx | Llama context (must not be null) |
| step | Step index: -1 for last step (default), or specific step index |
Exceptions
| std::runtime_error | if ctx is null or logits unavailable |
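IMPORTANT - Pointer Lifetime: the returned pointer refers to the internal llama.cpp logits buffer and is only valid until the next decode() call. After that call it may point to different or stale data.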
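If the values are needed after a later decode, one option is to copy them out of the internal buffer first. The sketch below is only an illustration: `snapshot_logits` is a hypothetical helper, not part of liblloyal, and it assumes the caller already knows the vocabulary size (for example from `lloyal::tokenizer::vocab_size(model)`).

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of liblloyal): copy n_vocab floats out of a
// borrowed logits buffer into storage owned by the caller.
inline std::vector<float> snapshot_logits(const float* logits, int n_vocab) {
    assert(logits != nullptr && n_vocab > 0);
    // Range constructor copies [logits, logits + n_vocab) into the vector.
    return std::vector<float>(logits, logits + n_vocab);
}
```

This trades the zero-copy property for safety: the copy costs one pass over the vocabulary, but the returned vector stays valid regardless of what later decode calls do to the internal buffer.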
EXAMPLE:
    // After decode with logits=true
    float* logits = lloyal::logits::get(ctx);
    int n_vocab = lloyal::tokenizer::vocab_size(model);

    // Compute entropy, sample, etc. - all synchronous
    float max_logit = *std::max_element(logits, logits + n_vocab);

    // After next decode(), logits pointer is INVALID
    await ctx.decode(next_tokens);
    // logits now points to different/stale data!
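The "compute entropy" step mentioned in the example can be done in one synchronous pass over the borrowed pointer, before any further decode. The following is a minimal sketch under stated assumptions: `softmax_entropy` is a hypothetical helper, not a liblloyal function, and it treats the buffer as `n_vocab` contiguous floats.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of liblloyal): Shannon entropy, in nats, of
// softmax(logits). Reads the buffer once and does not retain the pointer.
inline double softmax_entropy(const float* logits, int n_vocab) {
    assert(logits != nullptr && n_vocab > 0);

    // Subtract the maximum logit so that exp() cannot overflow.
    const float max_logit = *std::max_element(logits, logits + n_vocab);

    double z = 0.0;  // Z = sum_i exp(x_i - max)
    double s = 0.0;  // S = sum_i (x_i - max) * exp(x_i - max)
    for (int i = 0; i < n_vocab; ++i) {
        const double shifted = static_cast<double>(logits[i]) - max_logit;
        const double e = std::exp(shifted);
        if (e > 0.0) {  // skip masked (-inf) or fully underflowed entries
            z += e;
            s += shifted * e;
        }
    }

    // With p_i = exp(x_i - max) / Z:  H = -sum_i p_i log p_i = log Z - S / Z
    return std::log(z) - s / z;
}
```

A call such as `softmax_entropy(logits, n_vocab)` would sit where the example computes `max_logit`. Accumulating in `double` is a design choice here to limit rounding error over large vocabularies.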
Definition at line 60 of file logits.hpp.