Security techniques and quality control for prompts and agents
View on GitHubFebruary 2, 2026
Select agents to install to:
npx add-skill https://github.com/fusengine/agents/blob/main/plugins/prompt-engineer/skills/guardrails/SKILL.md -a claude-code --skill guardrailsInstallation paths:
.claude/skills/guardrails/# Guardrails
Skill for implementing security guardrails and quality control.
## 4-Layer Security Architecture
```
┌─────────────────────────────────────────────────────┐
│ LAYER 1: Input │
│ - Harmlessness screen (lightweight LLM) │
│ - Pattern matching (jailbreak regex) │
│ - PII detection/redaction │
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ LAYER 2: System │
│ - Ethical guardrails in system prompt │
│ - Explicit capability limits │
│ - Refusal instructions │
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ LAYER 3: Output │
│ - Format validation │
│ - Hallucination detection │
│ - Compliance check │
└─────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────┐
│ LAYER 4: Monitoring │
│ - Logs of all interactions │
│ - Alerts on suspicious patterns │
│ - Rate limiting per user │
└─────────────────────────────────────────────────────┘
```
## References
- [Input Guardrails](./references/input-guardrails.md) - Topical checks, jailbreak detection, PII redaction
- [Output Guardrails](./references/output-guardrails.md) - Format validation, hallucination detection, tool call validation
## Ethical Guardrails Template
```markdown
<<ethical_guardrails>>
You are bound by strict ethical and legal limits.
R