liblloyal 1.0.0
Branched Inference for llama.cpp
logits.hpp File Reference

Zero-copy logits access with clear lifetime semantics.

#include <llama/llama.h>
#include <cstring>
#include <span>
#include <stdexcept>
#include <string>
#include <vector>
#include "decode.hpp"
#include "kv.hpp"

Namespaces

namespace  lloyal
 Boundary Tracker Stub for OSS liblloyal.
 
namespace  lloyal::logits
 

Functions

float * lloyal::logits::get (llama_context *ctx, int32_t index=-1)
 
void lloyal::logits::process_chunks (llama_context *ctx, const std::vector< std::span< const llama_token > > &prompts, std::vector< float * > &output, int32_t n_vocab)
 Process arbitrary number of complete prompts for logit extraction.
 

Detailed Description

Zero-copy logits access with clear lifetime semantics.

Provides a safe wrapper around llama_get_logits_ith() with:

  • Null checking and error handling
  • Clear documentation of pointer lifetime
  • Consistent error messages
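As a rough illustration of the null-checking idea listed above, the sketch below validates a raw logits pointer and throws with a uniform message. The helper name `require_logits` and the message text are assumptions made for this example; they are not taken from liblloyal, and the library's actual error handling may differ.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of liblloyal): validates a raw pointer such as
// the one returned by llama_get_logits_ith() and reports failure consistently.
inline float* require_logits(float* raw, int index) {
    if (raw == nullptr) {
        // Assumed message format, chosen only for illustration.
        throw std::runtime_error(
            "logits unavailable for index " + std::to_string(index));
    }
    return raw;  // Non-null: hand the borrowed pointer straight back (no copy).
}
```

Centralizing the check in one place is what lets every caller see the same error text instead of each call site inventing its own.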

LIFETIME CONTRACT: The returned pointer is valid ONLY until the next decode()/encode() call. Shells are responsible for implementing their own safety mechanisms (e.g., buffer detachment, reference tracking) to prevent use-after-invalidation.
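One simple way for a caller to honor this contract is to copy the values it needs into storage it owns before the next decode()/encode() call. The sketch below assumes a hypothetical helper `snapshot_logits` (not part of liblloyal) and takes the vocabulary size as a plain argument; it gives up the zero-copy property in exchange for a buffer that stays valid.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of liblloyal): copies n_vocab floats out of
// the borrowed logits buffer into a vector owned by the caller.
inline std::vector<float> snapshot_logits(const float* logits,
                                          std::size_t n_vocab) {
    // Range constructor copies [logits, logits + n_vocab).
    return std::vector<float>(logits, logits + n_vocab);
}
```

The returned vector is independent of the context's internal buffer, so it can be kept across later decode() calls, whereas the original pointer cannot.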

USAGE:

    float* logits = lloyal::logits::get(ctx);
    int n_vocab = lloyal::tokenizer::vocab_size(model);
    // Use logits[0..n_vocab-1] synchronously
    // DO NOT store across decode() calls
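A typical synchronous use of the borrowed buffer is to reduce it to a single value immediately, for instance by picking the highest-scoring token. The sketch below is a minimal greedy argmax; `argmax_token` is a hypothetical name introduced for this example and is not part of liblloyal.

```cpp
#include <cstdint>

// Hypothetical helper (not part of liblloyal): returns the index of the
// largest logit in logits[0..n_vocab-1]. Assumes n_vocab >= 1.
inline int32_t argmax_token(const float* logits, int32_t n_vocab) {
    int32_t best = 0;
    for (int32_t i = 1; i < n_vocab; ++i) {
        if (logits[i] > logits[best]) {
            best = i;  // Strict '>' keeps the first index on ties.
        }
    }
    return best;
}
```

Because only the resulting integer outlives the call, nothing here retains the pointer past the next decode().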

Definition in file logits.hpp.