ollama

Ollama LLM inference server management via Podman Quadlet. Single-instance design with GPU acceleration for running local LLMs. Use when users need to configure Ollama, pull models, run inference, or manage the Ollama server.

- **Plugin:** bazzite-ai (productivity)
- **Repository:** atrawog/bazzite-ai-plugins
- **Path:** bazzite-ai/skills/ollama/SKILL.md
- **Last verified:** January 21, 2026

Install with the add-skill CLI:

```bash
npx add-skill https://github.com/atrawog/bazzite-ai-plugins/blob/main/bazzite-ai/skills/ollama/SKILL.md -a claude-code --skill ollama
```

Installation path (Claude): `.claude/skills/ollama/`

Instructions

# Ollama - Local LLM Inference Server

## Overview

The `ollama` command manages the Ollama LLM inference server using Podman Quadlet containers. It provides a single-instance server for running local LLMs with GPU acceleration.

**Key Concept:** Unlike Jupyter, Ollama uses a single-instance design because GPU memory is shared across all loaded models. The API listens on port 11434 by default.
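
Because the server speaks the standard Ollama HTTP API, any Ollama-compatible client can use it once the container is running. A minimal sketch, assuming the server is up on the default bind address and port (`127.0.0.1:11434`) and that `qwen3:4b` has already been pulled:

```bash
# List the models installed on the server
curl http://127.0.0.1:11434/api/tags

# One-shot, non-streaming generation against the default model
curl http://127.0.0.1:11434/api/generate \
  -d '{"model": "qwen3:4b", "prompt": "Say hi", "stream": false}'
```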

## Quick Reference

| Action | Command | Description |
|--------|---------|-------------|
| Config | `ujust ollama config [--port=...] [--gpu-type=...]` | Configure server |
| Start | `ujust ollama start` | Start server |
| Stop | `ujust ollama stop` | Stop server |
| Restart | `ujust ollama restart` | Restart server |
| Logs | `ujust ollama logs [--lines=...]` | View logs |
| Status | `ujust ollama status` | Show server status |
| Pull | `ujust ollama pull --model=<MODEL>` | Download a model |
| List | `ujust ollama list` | List installed models |
| Run | `ujust ollama run --model=<MODEL> [--prompt=...]` | Run model |
| Shell | `ujust ollama shell [-- CMD...]` | Open container shell |
| Delete | `ujust ollama delete` | Remove server and images |

## Parameters

| Parameter | Long Flag | Short | Default | Description |
|-----------|-----------|-------|---------|-------------|
| Port | `--port` | `-p` | `11434` | API port |
| GPU Type | `--gpu-type` | `-g` | `auto` | GPU type: `nvidia`, `amd`, `intel`, `none`, `auto` |
| Image | `--image` | `-i` | (default image) | Container image |
| Tag | `--tag` | `-t` | `stable` | Image tag |
| Config Dir | `--config-dir` | `-c` | `~/.config/ollama/1` | Config/data directory |
| Workspace | `--workspace-dir` | `-w` | (empty) | Optional mount to /workspace |
| Bind | `--bind` | `-b` | `127.0.0.1` | Bind address |
| Lines | `--lines` | `-l` | `50` | Log lines to show |
| Model | `--model` | `-m` | `qwen3:4b` | Model for pull/run actions |
| Prompt | `--prompt` | - | `say hi` | Prompt for run action |
| Context Length | `--context-length` | - | `8…` | Context window size |
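
Flags combine with the `config` action to change how the server is created. A hedged example (the NVIDIA GPU type, the LAN bind address, and the workspace path are illustrative choices, not defaults):

```bash
# Pin the GPU backend instead of auto-detection, expose the API on the LAN,
# and mount a host directory into the container at /workspace
ujust ollama config --gpu-type=nvidia --bind=0.0.0.0 \
  --workspace-dir="$HOME/projects/llm-work"

# Restart so the new configuration takes effect, then tail more log output
ujust ollama restart
ujust ollama logs --lines=200
```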
