liblloyal 1.0.0
Branched Inference for llama.cpp
chat_in.hpp File Reference

Chat Input Formatting with Full Format Awareness. More...

#include "common.hpp"
#include "tokenizer.hpp"
#include <llama/llama.h>
#include <chat.h>
#include <nlohmann/json.hpp>
#include <algorithm>
#include <exception>
#include <string>
#include <vector>

Go to the source code of this file.

Classes

struct  lloyal::chat_in::FormatInputs
 Input parameters for chat formatting. More...
 
struct  lloyal::chat_in::FormatResult
 Result from chat template formatting with full format awareness. More...
 

Namespaces

namespace  lloyal
 Boundary Tracker Stub for OSS liblloyal.
 
namespace  lloyal::chat_in
 Chat input formatting with full format awareness.
 

Functions

FormatResult lloyal::chat_in::format (const llama_model *model, const FormatInputs &inputs)
 Format chat messages using model's chat template with full format awareness.
 
bool lloyal::chat_in::validate (const std::string &template_str)
 Validate chat template syntax.
 
std::vector< llama_token > lloyal::chat_in::fallback_to_eog (const llama_model *model)
 Get EOG token as fallback when template parsing fails.
 
std::string lloyal::chat_in::get_token_safe (const llama_model *model, llama_token token)
 Get token text safely.
 
std::vector< llama_token > lloyal::chat_in::get_turn_separator (const llama_model *model)
 Get turn separator tokens for the model's chat template.
 

Detailed Description

Chat Input Formatting with Full Format Awareness.

Provides high-level chat template processing that passes through all format-awareness fields (tools, grammar, reasoning) and returns all output fields from common_chat_params. This enables callers to use format-aware grammar constraining and output parsing.
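To make the data flow concrete, here is a minimal self-contained sketch of the kind of message rendering the template layer performs. The struct and function below are illustrative stand-ins, not the library's API: the real lloyal::chat_in::FormatInputs and FormatResult are declared in chat_in.hpp and carry additional format-awareness fields (tools, grammar, reasoning), and the real rendering is done by llama.cpp's common_chat_templates_apply().

```cpp
#include <string>
#include <vector>

// Illustrative stand-in for a chat message; field names are assumptions
// made for this sketch.
struct Msg { std::string role; std::string content; };

// Toy ChatML-style rendering (the default fallback template in llama.cpp).
std::string apply_chatml(const std::vector<Msg>& messages, bool add_assistant_prompt) {
    std::string out;
    for (const auto& m : messages) {
        out += "<|im_start|>" + m.role + "\n" + m.content + "<|im_end|>\n";
    }
    if (add_assistant_prompt) {
        out += "<|im_start|>assistant\n";  // open the assistant turn for generation
    }
    return out;
}
```

In the real API, the rendered prompt comes back inside FormatResult along with the output fields of common_chat_params, so callers can wire grammar constraining and output parsing to the detected format.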

Architecture

Uses llama.cpp's common library (chat.h) for template processing:

  • common_chat_templates_init(): Initialize templates from model
  • common_chat_templates_apply(): Apply template with full inputs
  • common_chat_tools_parse_oaicompat(): Parse tool definitions
  • Graceful degradation when template processing fails
  • Turn separator extraction for warm prefill parity
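One way to picture the last bullet: render the same conversation with N turns and with N+1 turns, and the extra text is exactly what one additional turn contributes (separator framing plus the dummy content). The sketch below shows that diff idea only; it is an assumption about the approach, not the library's implementation, and the real get_turn_separator() returns llama_token ids via tokenizer.hpp rather than text.

```cpp
#include <string>

// Hypothetical diff-based extraction: for append-only templates, the render
// with N+1 turns extends the render with N turns, and the suffix is what
// one extra turn adds.
std::string extra_turn_text(const std::string& with_n, const std::string& with_n_plus_1) {
    if (with_n_plus_1.compare(0, with_n.size(), with_n) == 0) {
        return with_n_plus_1.substr(with_n.size());  // strict-prefix diff
    }
    return "";  // template is not append-only; caller must fall back
}
```

Capturing this separator lets a caller prefill a warm KV cache with exactly the tokens the template would emit between turns, keeping cached and freshly formatted prompts byte-identical.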

Dependencies

  • common/chat.h: common_chat_templates_init, common_chat_templates_apply
  • tokenizer.hpp: tokenize(), detokenize(), is_eog()

Fallback Hierarchy

  1. template_override (if provided)
  2. Model's built-in template (llama_model_chat_template)
  3. ChatML template (default fallback in llama.cpp)
  4. Simple "role: content" format (last resort)
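The hierarchy above amounts to a priority selection plus a plain-text last resort. A rough sketch follows; the function names and signatures are assumptions for illustration, not the library's API. In the real code, steps 1-3 are resolved inside common_chat_templates_init()/common_chat_templates_apply(), and step 4 is reached only when template processing itself fails.

```cpp
#include <string>

// Sketch of the documented fallback order (steps 1-3).
std::string select_template(const std::string& override_tmpl,
                            const std::string& model_tmpl) {
    if (!override_tmpl.empty()) return override_tmpl;  // 1. explicit override
    if (!model_tmpl.empty())    return model_tmpl;     // 2. model's built-in template
    return "chatml";                                   // 3. llama.cpp's default fallback
}

// 4. Last-resort "role: content" rendering when no template can be applied.
std::string render_plain(const std::string& role, const std::string& content) {
    return role + ": " + content + "\n";
}
```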
See also
common_chat_templates_init()
lloyal::tokenizer
lloyal::chat_out for output parsing

Definition in file chat_in.hpp.