# Parameter-Efficient Fine-Tuning (PEFT)
## Overview
PEFT methods like LoRA train only a small set of adapter parameters while keeping the base model frozen, cutting trainable parameters and optimizer memory by 10-100x while keeping quality close to full fine-tuning.
## Quick Reference
| Method | Memory | Speed | Quality |
|--------|--------|-------|---------|
| Full Fine-tune | High | Slow | Best |
| LoRA | Low | Fast | Very Good |
| QLoRA | Very Low | Fast | Good |
| Unsloth | Very Low | 2x Faster | Good |
## LoRA Concepts
### How LoRA Works
```
Original weight matrix W (frozen): d x k
LoRA adapters A and B: d x r and r x k (where r << min(d, k))

Forward pass:
    output = x @ W + (x @ A @ B) * (alpha / r)

Trainable params: r * (d + k) (instead of d * k)
```
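As a concrete illustration of this forward pass, here is a minimal PyTorch sketch of a LoRA-wrapped linear layer. The `LoRALinear` class is illustrative only, not part of the `peft` library:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative LoRA layer: frozen W plus trainable low-rank A @ B."""
    def __init__(self, d, k, r=8, alpha=16):
        super().__init__()
        self.W = nn.Parameter(torch.randn(d, k), requires_grad=False)  # frozen base weight
        self.A = nn.Parameter(torch.randn(d, r) * 0.01)  # trainable, d x r
        self.B = nn.Parameter(torch.zeros(r, k))         # trainable, r x k (zero init => no change at start)
        self.scale = alpha / r

    def forward(self, x):
        return x @ self.W + (x @ self.A @ self.B) * self.scale

layer = LoRALinear(d=4096, k=4096, r=8)
x = torch.randn(2, 4096)
print(layer(x).shape)  # torch.Size([2, 4096])
```

Because B starts at zero, the adapter initially contributes nothing and training only gradually perturbs the frozen base weights.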
### Memory Savings
```python
def lora_savings(d, k, r):
    original = d * k        # parameters in the full weight matrix
    lora = r * (d + k)      # A (d x r) plus B (r x k) adapter parameters
    reduction = (1 - lora / original) * 100
    return reduction

# Example: 4096 x 4096 matrix with rank 8
print(f"Memory reduction: {lora_savings(4096, 4096, 8):.1f}%")
# Output: ~99.6% reduction
```
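The same arithmetic can be checked on a real model by counting parameters directly. A small helper (the name `count_params` is just an example; it reports the same ratio as the `print_trainable_parameters()` call used below):

```python
import torch.nn as nn

def count_params(model: nn.Module):
    """Count trainable vs. total parameters in any PyTorch module."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable: {trainable:,} / total: {total:,} ({100 * trainable / total:.2f}%)")
    return trainable, total
```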
## Basic LoRA Setup
### Configure LoRA
```python
from peft import LoraConfig, get_peft_model, TaskType

lora_config = LoraConfig(
    r=8,                                  # Rank (adapter capacity)
    lora_alpha=16,                        # Scaling factor (effective scale = alpha / r)
    target_modules=["q_proj", "v_proj"],  # Which layers get adapters
    lora_dropout=0.05,                    # Regularization
    bias="none",                          # Don't train biases
    task_type=TaskType.CAUSAL_LM          # Task type
)
```
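The names accepted by `target_modules` depend on the architecture. One way to see which linear layers a model exposes is to list them; the names printed below are for LLaMA-style models such as the TinyLlama checkpoint used in the next step:

```python
import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Collect the distinct leaf names of all Linear layers
linear_names = sorted({name.split(".")[-1]
                       for name, module in model.named_modules()
                       if isinstance(module, nn.Linear)})
print(linear_names)
# LLaMA-style models: ['down_proj', 'gate_proj', 'k_proj', 'lm_head',
#                      'o_proj', 'q_proj', 'up_proj', 'v_proj']
```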
### Apply to Model
```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    device_map="auto"
)
model = get_peft_model(model, lora_config)

# Check trainable parameters
model.print_trainable_parameters()
# Output: trainable params: 4,194,304 || all params: 1,100,048,384 || trainable%: 0.38%
```
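After training, only the adapter weights need to be saved, not the base model. A minimal sketch (the directory name `tinyllama-lora-adapter` is just an example path):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Saves only adapter_config.json plus adapter weights (a few MB)
model.save_pretrained("tinyllama-lora-adapter")

# Later: reload the frozen base model and attach the trained adapter
base = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    device_map="auto"
)
model = PeftModel.from_pretrained(base, "tinyllama-lora-adapter")
```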
## LoRA Parameters
### Key Parameters
| Parameter | Values | Effect |
|-----------|--------|--------|
| `r` | 4, 8, 1