Complete llama.cpp C/C++ API reference covering model loading, inference, text generation, embeddings, chat, tokenization, sampling, batching, KV cache, LoRA adapters, and state management. Triggers on: llama.cpp questions, LLM inference code, GGUF models, local AI/ML inference, C/C++ LLM integration, "how do I use llama.cpp", API function lookups, implementation questions, troubleshooting llama.cpp issues, and any llama-cpp or ggerganov/llama.cpp mentions.
**Install**: `npx add-skill https://github.com/datathings/marketplace/blob/main/plugins/llamacpp/skills/llamacpp/SKILL.md -a claude-code --skill llamacpp`

**Installation path**: `.claude/skills/llamacpp/`
# llama.cpp C API Guide

Comprehensive reference for the llama.cpp C API, documenting all non-deprecated functions and common usage patterns.

## Overview

llama.cpp is a C/C++ implementation for LLM inference with minimal dependencies and state-of-the-art performance. This skill provides:

- **Complete API Reference**: All non-deprecated functions organized by category
- **Common Workflows**: Working examples for typical use cases
- **Best Practices**: Patterns for efficient and correct API usage

## Quick Start

See **[references/workflows.md](references/workflows.md)** for complete working examples.

Basic workflow:

1. `llama_backend_init()` - Initialize backend
2. `llama_model_load_from_file()` - Load model
3. `llama_init_from_model()` - Create context
4. `llama_tokenize()` - Convert text to tokens
5. `llama_decode()` - Process tokens
6. `llama_sampler_sample()` - Sample next token
7. Cleanup in reverse order

## When to Use This Skill

Use this skill when:

1. **API Lookup**: You need to find a specific function (e.g., "How do I load a model?", "What function creates a context?")
2. **Code Generation**: You're writing C code that uses llama.cpp
3. **Workflow Guidance**: You need to understand the steps for a task (e.g., text generation, embeddings, chat)
4. **Advanced Features**: You're working with batches, sequences, LoRA adapters, state management, or custom sampling
5. **Migration**: You're updating code from deprecated functions to the current API

## Core Concepts

### Key Objects

- **`llama_model`**: Loaded model weights and architecture
- **`llama_context`**: Inference state (KV cache, compute buffers)
- **`llama_batch`**: Input tokens and positions for processing
- **`llama_sampler`**: Token sampling configuration
- **`llama_vocab`**: Vocabulary and tokenizer
- **`llama_memory_t`**: KV cache memory handle

### Typical Flow

1. **Initialize**: `llama_backend_init()`
2. **Load Model**: `llama_model_load_from_file()`
3. **Create Context**: `llama_init_from_model()`
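A minimal sketch of this flow end to end, assuming a GGUF model at a hypothetical path `model.gguf`; greedy sampling stands in for a full sampler chain, and error handling is mostly trimmed for brevity:

```c
// Minimal greedy text generation with the llama.cpp C API.
// The model path "model.gguf" and the 64-token generation cap are
// illustrative assumptions, not part of the library.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "llama.h"

int main(void) {
    llama_backend_init();

    // Load the model and grab its vocabulary/tokenizer
    struct llama_model_params mparams = llama_model_default_params();
    struct llama_model * model = llama_model_load_from_file("model.gguf", mparams);
    if (!model) { fprintf(stderr, "failed to load model\n"); return 1; }
    const struct llama_vocab * vocab = llama_model_get_vocab(model);

    // Create a context (holds the KV cache and compute buffers)
    struct llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx = 2048;
    struct llama_context * ctx = llama_init_from_model(model, cparams);

    // Tokenize the prompt; a negative return value is the required token count
    const char * prompt = "Hello, world";
    int n_prompt = -llama_tokenize(vocab, prompt, (int) strlen(prompt),
                                   NULL, 0, true, true);
    llama_token * tokens = malloc(n_prompt * sizeof(llama_token));
    llama_tokenize(vocab, prompt, (int) strlen(prompt),
                   tokens, n_prompt, true, true);

    // Greedy sampler chain
    struct llama_sampler * smpl =
        llama_sampler_chain_init(llama_sampler_chain_default_params());
    llama_sampler_chain_add(smpl, llama_sampler_init_greedy());

    // Decode the prompt, then feed each sampled token back in
    struct llama_batch batch = llama_batch_get_one(tokens, n_prompt);
    for (int i = 0; i < 64; i++) {
        if (llama_decode(ctx, batch) != 0) break;

        llama_token new_token = llama_sampler_sample(smpl, ctx, -1);
        if (llama_vocab_is_eog(vocab, new_token)) break;

        char piece[128];
        int n = llama_token_to_piece(vocab, new_token, piece, sizeof(piece), 0, true);
        if (n > 0) fwrite(piece, 1, n, stdout);

        batch = llama_batch_get_one(&new_token, 1);
    }
    printf("\n");

    // Cleanup in reverse order of creation
    free(tokens);
    llama_sampler_free(smpl);
    llama_free(ctx);
    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```

Note how the teardown mirrors step 7 of the Quick Start: sampler, context, model, then backend, in the reverse of the order they were created.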