Supervised Fine-Tuning with SFTTrainer and Unsloth. Covers dataset preparation, chat template formatting, training configuration, and Unsloth optimizations for 2x faster instruction tuning. Includes thinking model patterns.
Source: atrawog/bazzite-ai-plugins, `bazzite-ai-jupyter/skills/sft/SKILL.md`
January 21, 2026
# Supervised Fine-Tuning (SFT)
## Overview
SFT adapts a pre-trained LLM to follow instructions by training it on instruction-response pairs. Unsloth patches TRL's `SFTTrainer` so that training runs about 2x faster with reduced memory usage. This skill also includes patterns for training thinking/reasoning models.
## Quick Reference
| Component | Purpose |
|-----------|---------|
| `FastLanguageModel` | Load model with Unsloth optimizations |
| `SFTTrainer` | Trainer for instruction tuning |
| `SFTConfig` | Training hyperparameters |
| `dataset_text_field` | Column containing formatted text |
| Token ID 151668 | `</think>` boundary for Qwen3-Thinking models |
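
The last row of the table can be made concrete with a small sketch. It assumes generated output is available as a plain list of token IDs and that everything up to and including the last `</think>` token counts as the thinking part. The helper name `split_thinking` and the sample IDs are hypothetical and only there for illustration.

```python
THINK_END_TOKEN_ID = 151668  # `</think>` for Qwen3-Thinking models, per the table above


def split_thinking(token_ids, think_end_id=THINK_END_TOKEN_ID):
    """Return (thinking_ids, answer_ids), split right after the last `</think>` token."""
    try:
        # Search the reversed list so the last boundary in the sequence wins
        boundary = len(token_ids) - token_ids[::-1].index(think_end_id)
    except ValueError:
        # No boundary token present: treat the whole sequence as the answer
        boundary = 0
    return token_ids[:boundary], token_ids[boundary:]


# Hypothetical token IDs, for illustration only
thinking, answer = split_thinking([11, 22, 151668, 33, 44])
```

The thinking half keeps the boundary token itself; decode each half separately with the tokenizer to recover the reasoning text and the final answer.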
## Critical Environment Setup
```python
import os
from dotenv import load_dotenv

# Load variables from a .env file, if one is found, into the process environment
load_dotenv()

# Force text-based progress bars in Jupyter
os.environ["TQDM_NOTEBOOK"] = "false"
```
## Critical Import Order
```python
# CRITICAL: Import unsloth FIRST for proper TRL patching
import unsloth
from unsloth import FastLanguageModel, is_bf16_supported
# Then other imports
from trl import SFTTrainer, SFTConfig
from datasets import Dataset
import torch
```
**Warning**: Importing TRL before Unsloth will disable optimizations and may cause errors.
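
One way to catch this early in a notebook is a small guard cell run before the imports. This is a minimal sketch, not part of Unsloth or TRL: the helper `check_import_order` is hypothetical, and it can only detect the case where `trl` is already loaded and `unsloth` is not. Once both are loaded it cannot tell which came first.

```python
import sys
import warnings


def check_import_order(loaded_modules=None):
    """Return False when trl is already loaded but unsloth is not, else True."""
    # Default to the interpreter's module table; a mapping can be passed for testing
    loaded = sys.modules if loaded_modules is None else loaded_modules
    if "trl" in loaded and "unsloth" not in loaded:
        warnings.warn(
            "trl was imported before unsloth; restart the kernel and import unsloth first."
        )
        return False
    return True


# Run this before the import cell above
safe_to_import = check_import_order()
```

If the guard returns `False`, restarting the kernel is the simplest fix, since re-importing does not undo an import that has already happened.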
## Dataset Formats
### Instruction-Response Format
```python
# Plain list of records; wrap it with Dataset.from_list(dataset) before calling .map()
dataset = [
    {"instruction": "What is Python?", "response": "A programming language."},
    {"instruction": "Explain ML.", "response": "Machine learning is..."},
]
```
### Chat/Conversation Format
```python
dataset = [
    {"messages": [
        {"role": "user", "content": "What is Python?"},
        {"role": "assistant", "content": "A programming language."}
    ]},
]
```
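
The two formats carry the same information, so records in the instruction-response format can be converted to the chat format mechanically. The following is a minimal sketch; the helper name `to_messages` is hypothetical, and it assumes every record has exactly the `instruction` and `response` keys shown above.

```python
def to_messages(record):
    """Convert one instruction-response record into the chat/conversation format."""
    return {
        "messages": [
            {"role": "user", "content": record["instruction"]},
            {"role": "assistant", "content": record["response"]},
        ]
    }


records = [
    {"instruction": "What is Python?", "response": "A programming language."},
]
# Each converted record is a single-turn conversation
chat_records = [to_messages(r) for r in records]
```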
### Using Chat Templates
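
A chat template turns a list of messages into the single string the trainer reads from the text column. As a rough illustration of what such a string looks like, the sketch below renders messages in a simplified ChatML-style layout. This is an assumption for illustration only: `render_chatml` is a hypothetical helper, and the template that ships with a given tokenizer may add system prompts, thinking tags, or other tokens.

```python
def render_chatml(messages):
    """Render messages in a simplified ChatML-style layout (illustrative only)."""
    parts = []
    for message in messages:
        # One block per turn: start marker, role, content, end marker
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    return "".join(parts)


text = render_chatml([
    {"role": "user", "content": "What is Python?"},
    {"role": "assistant", "content": "A programming language."},
])
```

For real training data, let the tokenizer's own template produce the string, as in the block that follows.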
```python
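# Assumes `tokenizer` was loaded alongside the model
def format_conversation(sample):
    # Build a single-turn conversation from one instruction-response record
    messages = [
        {"role": "user", "content": sample["instruction"]},
        {"role": "assistant", "content": sample["response"]}
    ]
    # Return the rendered string (not token IDs), with no trailing generation prompt
    return {"text": tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=False
    )}
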
dataset = dataset.map(f