increment-quality-judge-v2

verified

Enhanced AI-powered quality assessment with RISK SCORING (BMAD pattern) and quality gate decisions. Evaluates specifications, plans, and tests for clarity, testability, completeness, feasibility, maintainability, edge cases, and RISKS. Provides PASS/CONCERNS/FAIL decisions. Activates on these phrases: validate quality, quality check, assess spec, evaluate increment, spec review, quality score, risk assessment, qa check, quality gate, and the /sw:qa command.

Marketplace

specweave

anton-abyzov/specweave

Plugin

sw

development

Repository

anton-abyzov/specweave
27 stars

plugins/specweave/skills/increment-quality-judge-v2/SKILL.md

Last Verified

January 25, 2026

Install Skill

npx add-skill https://github.com/anton-abyzov/specweave/blob/main/plugins/specweave/skills/increment-quality-judge-v2/SKILL.md -a claude-code --skill increment-quality-judge-v2

Installation paths:

Claude
.claude/skills/increment-quality-judge-v2/

Instructions

# Increment Quality Judge v2.0

**LLM-as-Judge Pattern Implementation**

AI-powered quality assessment using the **LLM-as-Judge** pattern, an established AI/ML evaluation technique in which an LLM evaluates outputs with chain-of-thought reasoning. This skill combines that pattern with BMAD-pattern risk scoring and formal quality gate decisions (PASS/CONCERNS/FAIL).

## LLM-as-Judge: What It Is

**LLM-as-Judge (LaaJ)** is a recognized pattern in AI/ML evaluation where a large language model assesses quality using structured reasoning.

```
┌─────────────────────────────────────────────────────────────┐
│                 LLM-as-Judge Pattern                        │
├─────────────────────────────────────────────────────────────┤
│  Input:  spec.md, plan.md, tasks.md                        │
│                                                             │
│  Process:                                                   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ <thinking>                                          │   │
│  │   1. Read and understand the specification          │   │
│  │   2. Evaluate against 7 quality dimensions          │   │
│  │   3. Identify risks (P×I scoring)                   │   │
│  │   4. Form evidence-based verdict                    │   │
│  │ </thinking>                                         │   │
│  └─────────────────────────────────────────────────────┘   │
│                                                             │
│  Output: Structured verdict with:                          │
│  • Dimension scores (0-100)                                │
│  • Risk assessment (CRITICAL/HIGH/MEDIUM/LOW)              │
│  • Quality gate decision (PASS/CONCERNS/FAIL)              │
│  • Actionable recommendations                              │
└─────────────────────────────────────────────────────────────┘
```
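The output stage in the diagram can be sketched in a few lines of Python. This is an illustration only, not the skill's implementation: the `Risk` class, the `risk_level` and `gate_decision` helpers, the probability (0.0 to 1.0) and impact (1 to 10) scales, and every threshold are assumptions made for the example. Only the P×I product, the four risk levels, the 0-100 dimension scores, and the three gate outcomes come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One identified risk. The scales below are assumed for this sketch."""
    title: str
    probability: float  # assumed scale: 0.0 to 1.0
    impact: float       # assumed scale: 1 to 10

    @property
    def score(self) -> float:
        # P×I scoring, as named in the diagram.
        return self.probability * self.impact


def risk_level(score: float) -> str:
    """Map a P×I score to a level. Cut-offs are hypothetical."""
    if score >= 9.0:
        return "CRITICAL"
    if score >= 6.0:
        return "HIGH"
    if score >= 3.0:
        return "MEDIUM"
    return "LOW"


def gate_decision(dimension_scores: dict[str, int], risks: list[Risk]) -> str:
    """Combine 0-100 dimension scores and risks into a gate decision.

    The rule used here (mean score plus worst risk level) and the
    50/80 thresholds are assumptions, not the skill's documented logic.
    """
    if not dimension_scores:
        raise ValueError("at least one dimension score is required")
    overall = sum(dimension_scores.values()) / len(dimension_scores)
    levels = {risk_level(r.score) for r in risks}
    if "CRITICAL" in levels or overall < 50:
        return "FAIL"
    if "HIGH" in levels or overall < 80:
        return "CONCERNS"
    return "PASS"


# Usage: strong dimension scores, but one HIGH risk (0.7 × 9 ≈ 6.3).
scores = {"clarity": 90, "testability": 85, "completeness": 88}
risks = [Risk("Unbounded retry loop", probability=0.7, impact=9)]
print(gate_decision(scores, risks))  # prints CONCERNS
```

Keeping the risk check separate from the averaged score means a single severe risk can lower the gate decision even when every dimension scores well, which is the reason for scoring risks at all.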

**Why LLM-as-Judge works:**
- **Consistency**: Uniform evaluation criteria without human fatigue
- **Reasoning**: Chain-of-thought explains WHY something is an

Validation Details

Front Matter
Required Fields
Valid Name Format
Valid Description
Has Sections
Allowed Tools
Instruction Length: 14869 chars