
peft (verified)

Parameter-efficient fine-tuning with LoRA and Unsloth. Covers LoraConfig, target module selection, QLoRA for 4-bit training, adapter merging, and Unsloth optimizations for 2x faster training.


Marketplace: bazzite-ai-plugins (atrawog/bazzite-ai-plugins)
Plugin: bazzite-ai-jupyter (development)
Repository: atrawog/bazzite-ai-plugins
Path: bazzite-ai-jupyter/skills/peft/SKILL.md
Last Verified: January 21, 2026

Install with the add-skill CLI:

npx add-skill https://github.com/atrawog/bazzite-ai-plugins/blob/main/bazzite-ai-jupyter/skills/peft/SKILL.md -a claude-code --skill peft

Installs to .claude/skills/peft/ for Claude Code.

Instructions

# Parameter-Efficient Fine-Tuning (PEFT)

## Overview

PEFT methods like LoRA train only a small number of adapter parameters instead of the full model, reducing memory by 10-100x while maintaining quality.

## Quick Reference

| Method | Memory | Speed | Quality |
|--------|--------|-------|---------|
| Full Fine-tune | High | Slow | Best |
| LoRA | Low | Fast | Very Good |
| QLoRA | Very Low | Fast | Good |
| Unsloth | Very Low | 2x Faster | Good |
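
The QLoRA row gets its "Very Low" memory by keeping the frozen base weights in 4-bit while the LoRA adapters train in higher precision. A minimal sketch of that setup, assuming the bitsandbytes backend is installed (model name and hyperparameters are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantize the frozen base weights to 4-bit NF4
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    quantization_config=bnb_config,
    device_map="auto",
)

# Make the quantized model trainable (casts norms, enables input grads)
model = prepare_model_for_kbit_training(model)

# Only the LoRA adapters receive gradients
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))
```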

## LoRA Concepts

### How LoRA Works

```
Original weight matrix W (frozen):     d x k
LoRA adapters A and B:                 d x r, r x k (where r << min(d,k))

Forward pass:
  output = x @ W + x @ A @ B * (alpha / r)

Trainable params: r * (d + k)  (instead of d * k)
```
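
The same math in runnable form: a minimal PyTorch sketch of a LoRA-wrapped linear layer (an illustration of the formula above, not the actual peft implementation):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen W plus trainable low-rank update A @ B, scaled by alpha / r."""
    def __init__(self, d, k, r=8, alpha=16):
        super().__init__()
        self.W = nn.Parameter(torch.randn(d, k), requires_grad=False)  # frozen
        self.A = nn.Parameter(torch.randn(d, r) * 0.01)  # trainable
        self.B = nn.Parameter(torch.zeros(r, k))         # zero init: adapter starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        # output = x @ W + x @ A @ B * (alpha / r)
        return x @ self.W + self.scale * (x @ self.A @ self.B)

layer = LoRALinear(d=4096, k=4096, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 65,536 = 8 * (4096 + 4096), vs. 16,777,216 frozen
```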

### Memory Savings

```python
def lora_savings(d, k, r):
    """Percent parameter reduction from replacing a d x k update with rank-r LoRA."""
    original = d * k            # full weight matrix
    lora = r * (d + k)          # A (d x r) plus B (r x k)
    reduction = (1 - lora / original) * 100
    return reduction

# Example: 4096 x 4096 matrix with rank 8
print(f"Memory reduction: {lora_savings(4096, 4096, 8):.1f}%")
# Output: ~99.6% reduction
```

## Basic LoRA Setup

### Configure LoRA

```python
from peft import LoraConfig, get_peft_model, TaskType

lora_config = LoraConfig(
    r=8,                          # Rank (capacity)
    lora_alpha=16,                # Scaling factor
    target_modules=["q_proj", "v_proj"],  # Which layers
    lora_dropout=0.05,            # Regularization
    bias="none",                  # Don't train biases
    task_type=TaskType.CAUSAL_LM  # Task type
)
```
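
Module names such as `q_proj` and `v_proj` are architecture-specific. A quick, model-agnostic way to discover valid candidates is to scan the loaded model for `nn.Linear` submodules (a sketch; the example output assumes a Llama-style model):

```python
import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

# Distinct leaf names of all Linear layers -- each is a valid target_modules entry
linear_names = {name.split(".")[-1] for name, module in model.named_modules()
                if isinstance(module, nn.Linear)}
print(sorted(linear_names))
# e.g. ['down_proj', 'gate_proj', 'k_proj', 'lm_head', 'o_proj', 'q_proj', 'up_proj', 'v_proj']
```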

### Apply to Model

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    device_map="auto"
)

model = get_peft_model(model, lora_config)

# Check trainable parameters
model.print_trainable_parameters()
# Output: trainable params: 4,194,304 || all params: 1,100,048,384 || trainable%: 0.38%
```
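
Once trained, the adapter can be saved on its own or merged back into the base weights. A short sketch using peft's standard calls (output paths are placeholders):

```python
# Save only the adapter weights (a few MB, not the full model)
model.save_pretrained("./lora-adapter")

# Or fold the adapter into the base weights for adapter-free inference
merged = model.merge_and_unload()
merged.save_pretrained("./merged-model")
```

Merging removes the extra adapter matmul at inference time, at the cost of no longer being able to swap adapters on the same base model.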

## LoRA Parameters

### Key Parameters

| Parameter | Values | Effect |
|-----------|--------|--------|
| `r` | 4, 8, 16, 32 | Higher rank = more capacity and more trainable parameters |
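
To see how the rank choice trades capacity against memory, the `lora_savings` helper from earlier can be swept over typical values (a quick illustration, not a benchmark):

```python
# Parameter reduction for a 4096 x 4096 projection at common ranks
for r in (4, 8, 16, 32):
    print(f"r={r:2d}: {lora_savings(4096, 4096, r):.1f}% reduction")
# r= 4: 99.8% reduction
# r= 8: 99.6% reduction
# r=16: 99.2% reduction
# r=32: 98.4% reduction
```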
