sft

verified

Supervised Fine-Tuning with SFTTrainer and Unsloth. Covers dataset preparation, chat template formatting, training configuration, and Unsloth optimizations for 2x faster instruction tuning. Includes thinking model patterns.

Marketplace

bazzite-ai-plugins

atrawog/bazzite-ai-plugins

Plugin

bazzite-ai-jupyter

development

Repository

atrawog/bazzite-ai-plugins

bazzite-ai-jupyter/skills/sft/SKILL.md

Last Verified

January 21, 2026

Install Skill

npx add-skill https://github.com/atrawog/bazzite-ai-plugins/blob/main/bazzite-ai-jupyter/skills/sft/SKILL.md -a claude-code --skill sft

Installation paths:

Claude
.claude/skills/sft/

Instructions

# Supervised Fine-Tuning (SFT)

## Overview

SFT adapts a pre-trained LLM to follow instructions by training on instruction-response pairs. Unsloth patches TRL's `SFTTrainer` for 2x faster training with reduced memory usage. This skill also includes patterns for training thinking/reasoning models.

## Quick Reference

| Component | Purpose |
|-----------|---------|
| `FastLanguageModel` | Load model with Unsloth optimizations |
| `SFTTrainer` | Trainer for instruction tuning |
| `SFTConfig` | Training hyperparameters |
| `dataset_text_field` | Column containing formatted text |
| Token ID 151668 | `</think>` boundary for Qwen3-Thinking models |
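
The `</think>` token ID in the table can be used to separate a thinking model's reasoning from its final answer. The following is a minimal sketch that works on a plain list of generated token IDs; `split_thinking` and `THINK_END_TOKEN_ID` are hypothetical names introduced for illustration, not part of Unsloth or TRL.

```python
# Token ID for `</think>` in Qwen3-Thinking tokenizers, per the table above.
THINK_END_TOKEN_ID = 151668


def split_thinking(output_ids):
    """Split generated token IDs into (thinking_ids, answer_ids).

    Hypothetical helper: everything up to and including the last
    `</think>` token counts as thinking, the rest as the answer.
    """
    try:
        # Index just past the last occurrence of the boundary token.
        boundary = len(output_ids) - output_ids[::-1].index(THINK_END_TOKEN_ID)
    except ValueError:
        # No boundary token found: treat the whole sequence as the answer.
        boundary = 0
    return output_ids[:boundary], output_ids[boundary:]


# Made-up token IDs around the boundary, for illustration only.
thinking, answer = split_thinking([11, 22, THINK_END_TOKEN_ID, 33, 44])
```

Each half can then be decoded separately with the tokenizer, assuming the model actually emitted the boundary token.
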

## Critical Environment Setup

```python
import os
from dotenv import load_dotenv
load_dotenv()

# Force text-based progress in Jupyter
os.environ["TQDM_NOTEBOOK"] = "false"
```

## Critical Import Order

```python
# CRITICAL: Import unsloth FIRST for proper TRL patching
import unsloth
from unsloth import FastLanguageModel, is_bf16_supported

# Then other imports
from trl import SFTTrainer, SFTConfig
from datasets import Dataset
import torch
```

**Warning**: Importing TRL before Unsloth will disable optimizations and may cause errors.
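
One way to catch a wrong import order early in a notebook is to inspect `sys.modules` before importing Unsloth. This is a rough sketch, not an Unsloth feature: `trl_loaded_before_unsloth` is a hypothetical helper, and it only detects the case where `trl` is already loaded and `unsloth` is not.

```python
import sys


def trl_loaded_before_unsloth(modules=None):
    """Return True if `trl` is already imported but `unsloth` is not."""
    # Default to the interpreter's module table; a dict can be passed for testing.
    modules = sys.modules if modules is None else modules
    return "trl" in modules and "unsloth" not in modules


# Run this cell before `import unsloth`.
if trl_loaded_before_unsloth():
    print("trl was imported before unsloth; restart the kernel and import unsloth first.")
```
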

## Dataset Formats

### Instruction-Response Format

```python
dataset = [
    {"instruction": "What is Python?", "response": "A programming language."},
    {"instruction": "Explain ML.", "response": "Machine learning is..."},
]
```
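
Before formatting, it can help to check that every record has both fields filled in, since an empty response gives the model nothing to learn from. The sketch below assumes the two-key layout shown above; `find_invalid_records` is a hypothetical helper name.

```python
def find_invalid_records(records, required_keys=("instruction", "response")):
    """Return indices of records with a missing, non-string, or blank required field."""
    bad = []
    for i, record in enumerate(records):
        for key in required_keys:
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                bad.append(i)
                break  # one problem is enough to flag the record
    return bad


records = [
    {"instruction": "What is Python?", "response": "A programming language."},
    {"instruction": "Explain ML.", "response": ""},  # blank response
]
invalid = find_invalid_records(records)
```
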

### Chat/Conversation Format

```python
dataset = [
    {"messages": [
        {"role": "user", "content": "What is Python?"},
        {"role": "assistant", "content": "A programming language."}
    ]},
]
```
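
The two formats carry the same information, so an instruction-response record can be converted to the chat format with a small mapping function. This is a sketch under the assumption that the keys are named `instruction` and `response` as in the earlier example; `to_messages` is a hypothetical helper.

```python
def to_messages(sample):
    """Convert one instruction-response record into the chat format."""
    return {
        "messages": [
            {"role": "user", "content": sample["instruction"]},
            {"role": "assistant", "content": sample["response"]},
        ]
    }


pairs = [{"instruction": "What is Python?", "response": "A programming language."}]
chat_dataset = [to_messages(p) for p in pairs]
```
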

### Using Chat Templates

```python
def format_conversation(sample):
    messages = [
        {"role": "user", "content": sample["instruction"]},
        {"role": "assistant", "content": sample["response"]}
    ]
    return {"text": tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=False
    )}

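# Apply the formatter to every row (assumes `dataset` is a `datasets.Dataset`
# and `tokenizer` has already been loaded).
dataset = dataset.map(format_conversation)
```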
