liblloyal 1.0.0
Composable primitives for llama.cpp inference
helpers.hpp File Reference

Helper Utilities.

#include "common.hpp"
#include "minja/chat-template.hpp"
#include "minja/minja.hpp"
#include <cassert>
#include <chrono>
#include <llama/ggml.h>
#include <llama/llama.h>
#include <lloyal/nlohmann/json.hpp>
#include <memory>
#include <sstream>
#include <string>
#include <vector>


Classes

struct  lloyal::ChatTemplateResult
 Result from complete chat template processing.
 

Namespaces

namespace  lloyal
 JSON Schema to Grammar Converter (Header-Only)
 
namespace  lloyal::detail
 

Typedefs

using lloyal::json = nlohmann::ordered_json
 

Functions

std::string lloyal::detail::common_token_to_piece (const struct llama_vocab *vocab, llama_token token, bool special)
 
std::string lloyal::detail::get_token_safe (const llama_model *model, llama_token token)
 
const char * lloyal::detail::get_chatml_template ()
 
std::string lloyal::detail::apply_chat_template_helper (const std::string &template_str, const nlohmann::ordered_json &messages, const std::string &bos_token, const std::string &eos_token, bool add_generation_prompt, bool add_bos, bool add_eos)
 
void lloyal::batch_clear (llama_batch &batch)
 Clear batch to empty state.
 
void lloyal::batch_add (llama_batch &batch, llama_token id, int32_t pos, const std::vector< llama_seq_id > &seq_ids, bool logits, int32_t capacity=-1)
 Add single token to batch with position and sequence info.
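
A minimal decode sketch using the two batch helpers, assuming a loaded llama_context *ctx and a tokenized prompt in std::vector<llama_token> tokens (both placeholder names):

    llama_batch batch = llama_batch_init((int32_t) tokens.size(), /*embd=*/0, /*n_seq_max=*/1);

    lloyal::batch_clear(batch);
    for (int32_t i = 0; i < (int32_t) tokens.size(); ++i) {
        const bool last = (i == (int32_t) tokens.size() - 1);
        // Request logits only for the final position.
        lloyal::batch_add(batch, tokens[i], /*pos=*/i, /*seq_ids=*/{0}, /*logits=*/last);
    }

    if (llama_decode(ctx, batch) != 0) {
        // Decode failed; handle the error.
    }
    llama_batch_free(batch);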
 
std::string lloyal::format_chat_template_from_model (const llama_model *model, const std::string &messages_json, const std::string &template_override="")
 Format chat messages using model's built-in template.
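
For example, assuming model is a loaded llama_model * and the messages use the usual role/content JSON shape:

    std::string prompt = lloyal::format_chat_template_from_model(
        model, R"([{"role": "user", "content": "Hello!"}])");
    // Pass a Jinja template string as template_override to bypass the
    // model's built-in template.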
 
std::vector< std::string > lloyal::extract_template_stop_tokens (const llama_model *model, const std::string &template_str)
 Dynamically detect stop tokens from chat template.
 
ChatTemplateResult lloyal::format_chat_template_complete (const llama_model *model, const std::string &messages_json, const std::string &template_override="")
 Complete chat template processing with stop token detection.
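
A sketch of the one-call variant, again assuming a loaded llama_model *model; the members of ChatTemplateResult are documented on the struct's own page, so they are not dereferenced here:

    const std::string messages_json = R"([
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user",   "content": "Hello!"}
    ])";

    // Renders the prompt and detects template-specific stop tokens in one call.
    lloyal::ChatTemplateResult result =
        lloyal::format_chat_template_complete(model, messages_json);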
 
bool lloyal::validate_chat_template_helper (const std::string &template_str)
 Validate chat template syntax.
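
For instance, to reject a user-supplied override before it reaches generation (user_template is a placeholder name):

    if (!lloyal::validate_chat_template_helper(user_template)) {
        // Invalid template syntax; fall back to the model's built-in template.
    }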
 
const std::vector< ggml_type > & lloyal::get_kv_cache_types ()
 Get list of supported KV cache types.
 
ggml_type lloyal::kv_cache_type_from_str (const std::string &s)
 Convert cache type string to ggml_type enum.
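
A sketch of the cache-type plumbing; "q8_0" as an accepted name is an assumption based on common llama.cpp cache types:

    // Enumerate the cache types this build supports.
    for (ggml_type t : lloyal::get_kv_cache_types()) {
        printf("supported: %s\n", ggml_type_name(t));
    }

    // Resolve a user-facing string (e.g. a CLI flag value) to a ggml_type.
    ggml_type kv_type = lloyal::kv_cache_type_from_str("q8_0");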
 
bool lloyal::is_truthy (const std::string &value)
 Check if string represents a truthy value.
 
bool lloyal::is_falsey (const std::string &value)
 Check if string represents a falsey value.
 
bool lloyal::is_autoy (const std::string &value)
 Check if string represents an auto value.
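
Together these three predicates cover tri-state string options; exactly which spellings each accepts is not specified on this page, so the example value is an assumption:

    const std::string flag = "auto"; // e.g. from a CLI flag or environment variable
    if (lloyal::is_autoy(flag)) {
        // Defer to a heuristic default.
    } else if (lloyal::is_truthy(flag)) {
        // Explicitly enabled.
    } else if (lloyal::is_falsey(flag)) {
        // Explicitly disabled.
    }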
 
std::string lloyal::string_repeat (const std::string &str, size_t n)
 
std::string lloyal::string_join (const std::vector< std::string > &values, const std::string &separator)
 
std::vector< std::string > lloyal::string_split (const std::string &str, const std::string &delimiter)
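
A quick sketch of the string helpers; the commented results assume conventional repeat/join/split semantics:

    std::vector<std::string> parts  = lloyal::string_split("a,b,c", ",");  // {"a", "b", "c"}
    std::string              joined = lloyal::string_join(parts, " | ");   // "a | b | c"
    std::string              rule   = lloyal::string_repeat("-", 8);       // "--------"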
 

Detailed Description

Helper Utilities.

Collection of utility functions for common llama.cpp operations:

  • Batch operations: Build and manage token batches for decoding
  • Chat template processing: Format messages, extract stop tokens, validate templates
  • Parameter conversion: KV cache type mapping, string validation helpers
  • String utilities: Repeat, join, split operations

Source: Vendored from llama.cpp/common/
License: MIT License - Copyright (c) 2023-2024 The ggml.ai team

Definition in file helpers.hpp.