
advanced-guardrails

verified

LLM guardrails with NeMo, Guardrails AI, and OpenAI. Input/output rails, hallucination prevention, fact-checking, toxicity detection, red-teaming patterns. Use when building LLM guardrails, safety checks, or red-team workflows.


Marketplace

orchestkit

yonatangross/skillforge-claude-plugin

Plugin

orchestkit-complete

development

Repository

yonatangross/skillforge-claude-plugin
33 stars

./skills/advanced-guardrails/SKILL.md

Last Verified

January 23, 2026

Install Skill

npx add-skill https://github.com/yonatangross/skillforge-claude-plugin/blob/main/./skills/advanced-guardrails/SKILL.md -a claude-code --skill advanced-guardrails

Installation paths:

Claude
.claude/skills/advanced-guardrails/

Instructions

# Advanced Guardrails

Production LLM safety using NeMo Guardrails, Guardrails AI, and OpenAI moderation with red-teaming validation.

> **NeMo Guardrails 2026**: LangChain 1.x compatible, parallel rails execution, OpenTelemetry tracing. **DeepTeam**: 40+ vulnerabilities, OWASP Top 10 alignment.

## Overview

Use this skill for:

- Implementing input/output validation for LLM applications
- Preventing hallucinations and enforcing factuality
- Detecting and filtering toxic, harmful, or off-topic content
- Restricting LLM responses to specific domains/topics
- Detecting and redacting PII in LLM outputs
- Red-teaming and adversarial testing of LLM systems
- Complying with the OWASP Top 10 for LLM Applications
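
The PII item in the list above can be illustrated with a small plain-Python sketch. It uses hand-written regular expressions only to show the shape of a redaction step; `PII_PATTERNS` and `redact_pii` are hypothetical names introduced here, not part of NeMo Guardrails or Guardrails AI, and real validators rely on much more robust detection.

```python
import re
from typing import Dict, List, Pattern, Tuple

# Hypothetical, deliberately simple patterns (US-style formats only).
# Assumption: these stand in for a real PII validator for illustration.
PII_PATTERNS: Dict[str, Pattern[str]] = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone_number": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact_pii(text: str) -> Tuple[str, List[str]]:
    """Replace each matched entity with a <LABEL> placeholder.

    Returns the redacted text and the list of entity labels that were found,
    in the order the patterns are declared above.
    """
    found: List[str] = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            found.append(label)
            text = pattern.sub(f"<{label.upper()}>", text)
    return text, found


if __name__ == "__main__":
    redacted, entities = redact_pii("Contact jane@example.com or 555-123-4567.")
    print(redacted)
    print(entities)
```

Returning the detected labels alongside the redacted text lets a caller decide whether to pass the redacted output through or block the response entirely.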

## Framework Comparison

| Framework | Best For | Key Features |
|-----------|----------|--------------|
| **NeMo Guardrails** | Programmable flows, Colang 2.0 | Input/output rails, fact-checking, dialog control |
| **Guardrails AI** | Validator-based, modular | 100+ validators, PII, toxicity, structured output |
| **OpenAI Guardrails** | Drop-in wrapper | Simple integration, moderation API |
| **DeepTeam** | Red teaming, adversarial | GOAT attacks, multi-turn jailbreaking, vulnerability scanning |
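
The input/output rail pattern that these frameworks share can be sketched without any of them installed. The following is a minimal illustration, assuming a model exposed as a plain `str -> str` callable; `RailResult`, `length_rail`, `blocked_terms_rail`, and `guarded_generate` are hypothetical names chosen for this example and do not correspond to any framework's API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class RailResult:
    allowed: bool
    reason: Optional[str] = None


# A rail is any callable that inspects text and returns a verdict.
Rail = Callable[[str], RailResult]


def length_rail(min_len: int, max_len: int) -> Rail:
    """Allow text whose length falls within [min_len, max_len]."""
    def check(text: str) -> RailResult:
        if min_len <= len(text) <= max_len:
            return RailResult(True)
        return RailResult(False, f"length {len(text)} outside [{min_len}, {max_len}]")
    return check


def blocked_terms_rail(terms: Iterable[str]) -> Rail:
    """Reject text containing any of the given terms (case-insensitive)."""
    lowered: List[str] = [t.lower() for t in terms]

    def check(text: str) -> RailResult:
        haystack = text.lower()
        hit = next((t for t in lowered if t in haystack), None)
        if hit is None:
            return RailResult(True)
        return RailResult(False, f"blocked term: {hit}")
    return check


def guarded_generate(
    prompt: str,
    llm: Callable[[str], str],
    input_rails: Iterable[Rail],
    output_rails: Iterable[Rail],
    refusal: str = "I can't help with that request.",
) -> str:
    """Run input rails, call the model, then run output rails."""
    for rail in input_rails:
        if not rail(prompt).allowed:
            return refusal  # the model is never called for rejected input
    response = llm(prompt)
    for rail in output_rails:
        if not rail(response).allowed:
            return refusal
    return response


if __name__ == "__main__":
    def echo(p: str) -> str:  # stand-in for a real model call
        return f"Echo: {p}"

    rails_in = [blocked_terms_rail(["ignore previous instructions"])]
    rails_out = [length_rail(1, 50)]
    print(guarded_generate("hello", echo, rails_in, rails_out))
```

Keeping rails as plain callables means input and output checks compose the same way, which is roughly the separation the frameworks in the table formalize with their own configuration and validators.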

## Quick Reference

### NeMo Guardrails with Guardrails AI Integration

```yaml
# config.yml
models:
  - type: main
    engine: openai
    model: gpt-4o

rails:
  config:
    guardrails_ai:
      validators:
        - name: toxic_language
          parameters:
            threshold: 0.5
            validation_method: "sentence"
        - name: guardrails_pii
          parameters:
            entities: ["phone_number", "email", "ssn", "credit_card"]
        - name: restricttotopic
          parameters:
            valid_topics: ["technology", "support"]
        - name: valid_length
          parameters:
            min: 10
            max: 500

  input:
    flows:
      - guardrailsai check input $validator="guardrails_pii"
      - guardrailsai check input $validator="competitor_check"

  output:
    flows:
```

Validation Details

Instruction Length: 9108 chars