| Return type | Member function | Description |
| --- | --- | --- |
| llama_token | greedy (llama_context *ctx, const llama_vocab *vocab) | Greedy sampling: Select token with highest probability. |
| template<SamplingParamsLike P> llama_token | sample_with_params (llama_context *ctx, const llama_vocab *vocab, const P &params, llama_sampler *grammarSampler=nullptr) | Sample with configurable parameters (template accepts any SamplingParams type). |
| llama_token | greedy (llama_context *ctx, const llama_model *model) | Greedy sampling with automatic vocab extraction. |
| template<SamplingParamsLike P> llama_token | sample_with_params (llama_context *ctx, const llama_model *model, const P &params, llama_sampler *grammarSampler=nullptr) | Parameterized sampling with automatic vocab extraction. |
| template<SamplingParamsLike P> llama_sampler * | create_chain (const P &params) | Create a persistent sampler chain from parameters. |
| llama_sampler * | clone_chain (llama_sampler *chain) | Clone a sampler chain. |
| void | reseed_chain (llama_sampler *chain, uint32_t new_seed) | Reseed the dist sampler in a chain. |
| void | free_chain (llama_sampler *chain) | Free a sampler chain. |
| void | apply (llama_sampler *chain, llama_token_data_array *cur_p) | Apply a sampler chain to a candidate array. |
| void | accept (llama_sampler *chain, llama_token token) | Accept a token into the sampler chain. |
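The chain helpers (create_chain, apply, accept, reseed_chain, free_chain) are only summarized above, so here is a minimal sketch of the persistent-chain lifecycle. The candidate-array field layout (id/logit/p, selected, sorted) and the vocab-size call are assumed from recent llama.cpp headers; the include path and `params` type are placeholders for anything satisfying SamplingParamsLike.

```cpp
// Sketch of the persistent-chain workflow; not a definitive implementation.
// Assumptions: llama_token_data{id, logit, p}, llama_token_data_array{data,
// size, selected, sorted}, and llama_vocab_n_tokens() from recent llama.cpp.
#include <vector>
#include "sampler.hpp"   // lloyal::sampler (include path assumed)

template <SamplingParamsLike P>
void generate_n(llama_context *ctx, const llama_vocab *vocab, const P &params, int n) {
    llama_sampler *chain = lloyal::sampler::create_chain(params);  // build once, reuse
    const int n_vocab = llama_vocab_n_tokens(vocab);               // name assumed (llama.cpp)

    for (int step = 0; step < n; ++step) {
        const float *logits = llama_get_logits_ith(ctx, -1);       // last decoded position

        // Wrap raw logits in a candidate array for the chain to filter and select.
        std::vector<llama_token_data> cand(n_vocab);
        for (int id = 0; id < n_vocab; ++id)
            cand[id] = llama_token_data{ id, logits[id], 0.0f };
        llama_token_data_array cur_p{ cand.data(), cand.size(),
                                      /*selected=*/-1, /*sorted=*/false };

        lloyal::sampler::apply(chain, &cur_p);                     // run the whole chain
        llama_token tok = cur_p.data[cur_p.selected].id;           // chain picks one token
        lloyal::sampler::accept(chain, tok);                       // update penalty/grammar state

        // ... decode 'tok' with logits=true before the next iteration ...
    }

    lloyal::sampler::reseed_chain(chain, /*new_seed=*/1234);       // optional: fresh RNG stream
    lloyal::sampler::free_chain(chain);
}
```

Creating the chain once and accepting each sampled token keeps repetition-penalty and grammar state consistent across steps, which per-call sampling (sample_with_params) does not need to track.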
inline llama_token lloyal::sampler::greedy (llama_context * ctx, const llama_vocab * vocab)
Greedy sampling: Select token with highest probability.
Uses llama_get_logits_ith(-1) to get last-step logits (requires logits=true in batch for that position). Performs argmax to find best token.
Parameters:
- ctx: Llama context (must have decoded at least one token with logits=true)
- vocab: Vocabulary for size information

Returns: Token ID with highest probability

Exceptions:
- std::runtime_error: if logits retrieval fails
IMPORTANT: Only works if decode batch had logits=true for last token. Decoder layer automatically sets this correctly.
Definition at line 110 of file sampler.hpp.
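A minimal sketch of the logits=true requirement follows, using the raw llama.cpp batch API directly rather than the library's Decoder layer. The batch field names and llama_batch_init/llama_batch_free calls are upstream llama.cpp; error handling is omitted.

```cpp
// Sketch: decode one token with logits enabled, then greedy-sample.
// Assumes the standard llama.cpp batch API; in this library the Decoder
// layer would normally set logits=true for you.
#include "sampler.hpp"   // lloyal::sampler (include path assumed)

llama_token greedy_next(llama_context *ctx, const llama_vocab *vocab,
                        llama_token last_token, llama_pos pos) {
    llama_batch batch = llama_batch_init(/*n_tokens=*/1, /*embd=*/0, /*n_seq_max=*/1);
    batch.n_tokens     = 1;
    batch.token[0]     = last_token;
    batch.pos[0]       = pos;
    batch.n_seq_id[0]  = 1;
    batch.seq_id[0][0] = 0;
    batch.logits[0]    = true;   // required: greedy() reads the last position's logits

    llama_decode(ctx, batch);    // check the return value in real code
    llama_token next = lloyal::sampler::greedy(ctx, vocab);  // argmax over last-step logits
    llama_batch_free(batch);
    return next;
}
```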
template<SamplingParamsLike P>
inline llama_token lloyal::sampler::sample_with_params (llama_context * ctx, const llama_model * model, const P & params, llama_sampler * grammarSampler = nullptr)
Parameterized sampling with automatic vocab extraction.
Convenience wrapper that handles vocab extraction from model. Supports temperature, top-k, top-p, min-p, and penalty parameters.
Parameters:
- ctx: Llama context
- model: Llama model
- params: Sampling parameters (any SamplingParamsLike type)
- grammarSampler: Optional grammar constraint (default: nullptr)

Returns: Sampled token ID
Definition at line 429 of file sampler.hpp.
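A minimal usage sketch of the model overload follows. The member names on MySamplingParams (temperature, top_k, top_p, min_p) are assumptions about what the SamplingParamsLike concept requires; they are not documented on this page.

```cpp
// Sketch: parameterized sampling via the model overload; the vocab is
// extracted from the model inside sample_with_params. The member names
// below are assumptions about SamplingParamsLike, not taken from this page.
#include <cstdint>
#include "sampler.hpp"   // include path assumed

struct MySamplingParams {
    float   temperature = 0.8f;
    int32_t top_k       = 40;
    float   top_p       = 0.95f;
    float   min_p       = 0.05f;
    // ...penalty fields, seed, etc., as the concept requires
};

llama_token next_token(llama_context *ctx, const llama_model *model) {
    MySamplingParams params;   // any type satisfying SamplingParamsLike works
    return lloyal::sampler::sample_with_params(ctx, model, params);
}
```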
template<SamplingParamsLike P>
inline llama_token lloyal::sampler::sample_with_params (llama_context * ctx, const llama_vocab * vocab, const P & params, llama_sampler * grammarSampler = nullptr)
Sample with configurable parameters (template accepts any SamplingParams type)
Supports full range of llama.cpp sampling strategies:
- Temperature scaling
- Top-k, top-p, min-p filtering
- Repetition penalties (frequency, presence, repeat)
- Grammar constraints (via persistent grammar sampler)
Parameters:
- ctx: Llama context (must have decoded at least one token with logits=true)
- vocab: Vocabulary for token information
- params: Sampling parameters (any type matching SamplingParamsLike concept)
- grammarSampler: Optional persistent grammar sampler (managed by caller)

Returns: Sampled token ID

Exceptions:
- std::runtime_error: if sampling fails
TEMPLATE INSTANTIATION: Works with any SamplingParams type matching the concept constraint. No adapters needed - uses duck typing + C++20 concepts.
Definition at line 179 of file sampler.hpp.
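As a sketch of the vocab overload with a grammar constraint: the code below reuses the hypothetical MySamplingParams type from the previous example, and the grammar-creation call shown in the trailing comment is the upstream llama.cpp API as of recent versions, whose exact signature is version-dependent; the caller owns and frees the grammar sampler, as the parameter description above requires.

```cpp
// Sketch: constrained sampling via a caller-owned, persistent grammar sampler.
// MySamplingParams is the hypothetical SamplingParamsLike type from the
// earlier example; the grammar-creation call is upstream llama.cpp and
// version-dependent (shown only as an assumption).
#include "sampler.hpp"   // include path assumed

llama_token constrained_next(llama_context *ctx, const llama_vocab *vocab,
                             llama_sampler *grammar /* caller-owned; may be nullptr */) {
    MySamplingParams params;
    params.temperature = 0.2f;   // low temperature for structured output

    // Because the grammar sampler is persistent, partial-parse state
    // (e.g. an open JSON object) carries over between successive tokens.
    return lloyal::sampler::sample_with_params(ctx, vocab, params, grammar);
}

// Caller-side construction, roughly (recent llama.cpp; signature may differ):
//   llama_sampler *grammar = llama_sampler_init_grammar(vocab, gbnf_text, "root");
```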