# Supervised Fine-Tuning (SFT)

## Overview

SFT adapts a pre-trained LLM to follow instructions by training on instruction-response pairs. Unsloth patches TRL's `SFTTrainer` for roughly 2x faster training with reduced memory usage. This skill also includes patterns for training thinking/reasoning models.

## Quick Reference

| Component | Purpose |
|-----------|---------|
| `FastLanguageModel` | Load model with Unsloth optimizations |
| `SFTTrainer` | Trainer for instruction tuning |
| `SFTConfig` | Training hyperparameters |
| `dataset_text_field` | Column containing formatted text |
| Token ID 151668 | `</think>` boundary for Qwen3-Thinking models |
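
The `</think>` token ID in the table can be used to separate a model's reasoning from its final answer. The following is a minimal sketch, assuming the generated output is available as a plain list of token IDs; `split_thinking` and `THINK_END_ID` are hypothetical names, and the ID should be checked against the tokenizer actually in use.

```python
# Assumption: 151668 is the `</think>` token ID listed in the table
# (Qwen3-Thinking tokenizers); verify it against your own tokenizer.
THINK_END_ID = 151668


def split_thinking(output_ids):
    """Split generated token IDs into (thinking_ids, answer_ids).

    The split happens right after the LAST `</think>` token. If the token
    is absent, everything is treated as the answer.
    """
    try:
        # Search the reversed list so the last occurrence wins.
        boundary = len(output_ids) - output_ids[::-1].index(THINK_END_ID)
    except ValueError:
        boundary = 0
    return output_ids[:boundary], output_ids[boundary:]


# Toy IDs for illustration, not real tokenizer output.
thinking, answer = split_thinking([11, 22, THINK_END_ID, 33, 44])
print(thinking)  # [11, 22, 151668]
print(answer)    # [33, 44]
```

Each half can then be decoded separately with the tokenizer.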

## Critical Environment Setup

```python
import os
from dotenv import load_dotenv
load_dotenv()

# Force text-based progress in Jupyter
os.environ["TQDM_NOTEBOOK"] = "false"
```

## Critical Import Order

```python
# CRITICAL: Import unsloth FIRST for proper TRL patching
import unsloth
from unsloth import FastLanguageModel, is_bf16_supported

# Then other imports
from trl import SFTTrainer, SFTConfig
from datasets import Dataset
import torch
```

**Warning**: Importing TRL before Unsloth prevents Unsloth from patching TRL, which disables the optimizations and may cause errors.
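
One way to catch a wrong import order early is to inspect `sys.modules` before importing Unsloth. This is only a sketch using the standard library; `trl_imported_before_unsloth` is a hypothetical helper, not part of Unsloth or TRL.

```python
import sys


def trl_imported_before_unsloth(modules=None):
    """Return True if `trl` is already loaded while `unsloth` is not."""
    modules = sys.modules if modules is None else modules
    return "trl" in modules and "unsloth" not in modules


# Run this check immediately before `import unsloth`.
if trl_imported_before_unsloth():
    print("Warning: trl was imported before unsloth; "
          "restart the kernel and import unsloth first.")
```

In a notebook, a restart is usually needed after the warning fires, because modules that are already imported stay loaded.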

## Dataset Formats

### Instruction-Response Format

```python
dataset = [
    {"instruction": "What is Python?", "response": "A programming language."},
    {"instruction": "Explain ML.", "response": "Machine learning is..."},
]
```
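
Before training, it can help to confirm that every record actually has both fields filled in. The check below is a minimal sketch in plain Python; `validate_records` is a hypothetical helper and the field names are taken from the example above.

```python
def validate_records(records, keys=("instruction", "response")):
    """Return the indices of records lacking a non-empty string for any key."""
    bad = []
    for i, record in enumerate(records):
        for key in keys:
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                bad.append(i)
                break  # one bad field is enough to flag the record
    return bad


records = [
    {"instruction": "What is Python?", "response": "A programming language."},
    {"instruction": "Explain ML.", "response": ""},  # empty response
]
print(validate_records(records))  # [1]
```

A plain Python list has no `.map` method, so a validated list would typically be wrapped with `datasets.Dataset.from_list(records)` before any mapping step.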

### Chat/Conversation Format

```python
dataset = [
    {"messages": [
        {"role": "user", "content": "What is Python?"},
        {"role": "assistant", "content": "A programming language."}
    ]},
]
```
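
Instruction-response records can be converted to this chat layout with a small mapping function. The sketch below assumes the `instruction` and `response` field names from the previous format; `to_messages` is a hypothetical helper.

```python
def to_messages(record):
    """Convert one instruction-response record into the chat format."""
    return {"messages": [
        {"role": "user", "content": record["instruction"]},
        {"role": "assistant", "content": record["response"]},
    ]}


pairs = [
    {"instruction": "What is Python?", "response": "A programming language."},
]
chat_dataset = [to_messages(record) for record in pairs]
print(chat_dataset[0]["messages"][1]["content"])  # A programming language.
```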

### Using Chat Templates

```python
def format_conversation(sample):
    messages = [
        {"role": "user", "content": sample["instruction"]},
        {"role": "assistant", "content": sample["response"]}
    ]
    return {"text": tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=False
    )}

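# Assumes `tokenizer` is already loaded and `dataset` is a `datasets.Dataset`
dataset = dataset.map(format_conversation)
```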