qlora

verified

Advanced QLoRA experiments and comparisons. Covers alpha scaling, LoRA rank selection, target module strategies, continual learning, multi-adapter hot-swapping, and quantization comparison (4-bit vs BF16).

Marketplace: bazzite-ai-plugins (atrawog/bazzite-ai-plugins)
Plugin: bazzite-ai-jupyter (development)
Repository: atrawog/bazzite-ai-plugins
Source: bazzite-ai-jupyter/skills/qlora/SKILL.md
Last Verified: January 21, 2026

Install Skill

npx add-skill https://github.com/atrawog/bazzite-ai-plugins/blob/main/bazzite-ai-jupyter/skills/qlora/SKILL.md -a claude-code --skill qlora

Installation path (Claude): .claude/skills/qlora/

Instructions

# Advanced QLoRA Experiments

## Overview

This skill covers advanced QLoRA experimentation patterns for optimizing fine-tuning performance. Learn how to select the best LoRA rank, alpha scaling, target modules, and quantization settings for your specific use case.

## Quick Reference

| Topic | Key Finding |
|-------|-------------|
| **Rank (r)** | r=16 is the optimal balance; use r=8 when memory-constrained |
| **Alpha** | alpha=r (1.0x scaling) is standard; alpha=2r for aggressive updates |
| **Target Modules** | all_linear for general use; mlp_only for knowledge injection |
| **Quantization** | 4-bit NF4 matches BF16 quality with 11-15% memory savings |
| **Continual Learning** | Sequential training adds knowledge without forgetting |
| **Token ID 151668** | `</think>` boundary for Qwen3-Thinking models (see the sketch after this table) |
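
The last row is the easiest to act on directly in code. Below is a minimal sketch, not taken from the skill itself, that splits generated token IDs at the `</think>` boundary; it assumes `output_ids` is the generated continuation (prompt tokens removed) and `tokenizer` is the Qwen3-Thinking tokenizer loaded as shown in the setup below.

```python
THINK_END_ID = 151668  # token id of </think> in Qwen3-Thinking tokenizers

def split_thinking(output_ids, tokenizer):
    """Return (reasoning, answer) decoded from generated token ids."""
    try:
        # Find the position just after the last </think> token
        boundary = len(output_ids) - output_ids[::-1].index(THINK_END_ID)
    except ValueError:
        # No </think> found: treat the whole output as the final answer
        return "", tokenizer.decode(output_ids, skip_special_tokens=True).strip()
    reasoning = tokenizer.decode(output_ids[:boundary], skip_special_tokens=True)
    answer = tokenizer.decode(output_ids[boundary:], skip_special_tokens=True)
    return reasoning.strip(), answer.strip()
```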

## Critical Environment Setup

```python
import os
from dotenv import load_dotenv
load_dotenv()

# Force text-based progress in Jupyter
os.environ["TQDM_NOTEBOOK"] = "false"

# CRITICAL: Import unsloth FIRST (before transformers/TRL) so its patches apply
import unsloth
from unsloth import FastLanguageModel, is_bf16_supported
```

## Alpha Scaling

### Formula

The effective LoRA scaling factor is:

```
scaling_factor = alpha / r
```

This acts as a learning rate multiplier for the adapter weights: at a fixed r=16, alpha=16 gives 1.0x scaling, alpha=32 gives 2.0x, and alpha=8 gives 0.5x, as the toy sketch below illustrates.
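
For intuition, here is a toy sketch (plain PyTorch, not part of the skill's code) of where that scaling factor enters the LoRA forward pass; the shapes and values are arbitrary placeholders.

```python
import torch

r, alpha = 16, 32
scaling = alpha / r               # 2.0x in this example

W0 = torch.randn(64, 64)          # frozen base weight (toy size)
A = torch.randn(r, 64) * 0.01     # LoRA "down" projection (small random init)
B = torch.zeros(64, r)            # LoRA "up" projection (zero init, so the
                                  # adapter starts as a no-op)
x = torch.randn(64)

# Base forward pass plus the scaled low-rank update
h = W0 @ x + scaling * (B @ (A @ x))
```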

### Alpha Comparison Code

```python
import unsloth
from unsloth import FastLanguageModel, is_bf16_supported
from trl import SFTTrainer, SFTConfig
from transformers import TrainerCallback

ALPHAS = [8, 16, 32, 64]
FIXED_RANK = 16
results = []

for alpha in ALPHAS:
    scaling_factor = alpha / FIXED_RANK
    print(f"\n=== Testing alpha={alpha} (scaling={scaling_factor}x) ===")

    # Load fresh model
    model, tokenizer = FastLanguageModel.from_pretrained(
        "unsloth/Qwen3-4B-Thinking-2507-unsloth-bnb-4bit",
        max_seq_length=512,
        load_in_4bit=True,
    )

    # Apply LoRA with specific alpha
    model = FastLanguageModel.get_peft_model(
        model,
        r=FIXED_RANK,
        lora_alpha=alpha,  # Variable alpha
        # "all_linear" strategy from the quick reference: adapt every
        # linear projection (illustrative completion; tune to your setup)
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],
    )

    # Short training run so each alpha is compared on equal footing;
    # dataset and hyperparameters here are placeholders
    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=train_dataset,  # assumes a prepared SFT dataset
        args=SFTConfig(
            per_device_train_batch_size=2,
            max_steps=60,
            learning_rate=2e-4,
            bf16=is_bf16_supported(),
            output_dir=f"outputs/alpha_{alpha}",
            report_to="none",
        ),
    )
    stats = trainer.train()

    # Record the outcome for this alpha
    results.append({"alpha": alpha, "scaling": scaling_factor,
                    "train_loss": stats.training_loss})
```
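
Once the loop completes, the collected `results` can be compared directly; the field names below follow the illustrative entries recorded in the loop above.

```python
# Rank the runs by final training loss (lower is better)
for entry in sorted(results, key=lambda e: e["train_loss"]):
    print(f"alpha={entry['alpha']:>3}  "
          f"scaling={entry['scaling']:.2f}x  "
          f"loss={entry['train_loss']:.4f}")
```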