Security patterns for LLM integrations, including prompt injection defense and hallucination prevention. Use when implementing context separation, validating LLM outputs, or protecting against prompt injection attacks.
Install with:

```bash
npx add-skill https://github.com/yonatangross/orchestkit/blob/main/plugins/ork-llm/skills/llm-safety-patterns/SKILL.md -a claude-code --skill llm-safety-patterns
```

Installation path:
`.claude/skills/llm-safety-patterns/`

# LLM Safety Patterns

## The Core Principle

> **Identifiers flow AROUND the LLM, not THROUGH it.**
> **The LLM sees only content. Attribution happens deterministically.**

## Why This Matters

When identifiers appear in prompts, bad things happen:

1. **Hallucination:** LLM invents IDs that don't exist
2. **Confusion:** LLM mixes up which ID belongs where
3. **Injection:** Attacker manipulates IDs via prompt injection
4. **Leakage:** IDs appear in logs, caches, traces
5. **Cross-tenant:** LLM could reference other users' data

## The Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                                                                  │
│  SYSTEM CONTEXT (flows around LLM)                               │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ user_id │ tenant_id │ analysis_id │ trace_id │ permissions │  │
│  └────────────────────────────────────────────────────────────┘  │
│       │                                                  │       │
│       │                                                  │       │
│       ▼                                                  ▼       │
│  ┌─────────┐                                      ┌─────────┐    │
│  │ PRE-LLM │       ┌─────────────────────┐        │POST-LLM │    │
│  │ FILTER  │──────▶│         LLM         │───────▶│ATTRIBUTE│    │
│  │         │       │                     │        │         │    │
│  │ Returns │       │ Sees ONLY:          │        │ Adds:   │    │
│  │ CONTENT │       │ - content text      │        │ - IDs   │    │
│  │ (no IDs)│       │ - context text      │        │ - refs  │    │
│  └─────────┘       │   (NO IDs!)         │        └─────────┘    │
│                    └─────────────────────┘                       │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
```
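A minimal TypeScript sketch of this flow. The skill text itself ships no code here, so every name below (`SystemContext`, `Doc`, `callLLM`, `analyze`) is an illustrative assumption, not an OrchestKit API:

```typescript
// Illustrative types -- not part of this skill or OrchestKit's API.
interface SystemContext {
  userId: string;
  tenantId: string;
  analysisId: string;
  traceId: string;
}

interface Doc {
  id: string;      // identifier: flows AROUND the model
  content: string; // content: the only thing the model sees
}

// Stand-in for any LLM client call.
declare function callLLM(prompt: string): Promise<string>;

// PRE-LLM filter: strip IDs, keep content, remember the mapping by index.
function toContentOnly(docs: Doc[]): string {
  return docs.map((d, i) => `[${i}] ${d.content}`).join("\n\n");
}

async function analyze(ctx: SystemContext, docs: Doc[]) {
  // The prompt contains numbered content and nothing else -- no user_id,
  // no tenant_id, no document IDs.
  const output = await callLLM(
    `Summarize each numbered passage, citing passages by number only.\n\n${toContentOnly(docs)}`
  );

  // POST-LLM attribution: IDs are reattached deterministically from the
  // context that flowed around the model, never from model output.
  return {
    summary: output,
    userId: ctx.userId,
    tenantId: ctx.tenantId,
    analysisId: ctx.analysisId,
    traceId: ctx.traceId,
    sourceIds: docs.map((d) => d.id),
  };
}
```

Because the model can only name passages by index, attribution back to real IDs stays a lookup on the caller's side rather than something the model can hallucinate or an attacker can inject.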
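Deterministic attribution also gives a natural validation point: any reference the model emits must map back to a known input, so hallucinated or injected identifiers fail closed. A sketch, reusing the `Doc` shape above (`attributeReferences` is a hypothetical helper):

```typescript
// Map model-emitted passage indices back to real IDs, rejecting anything
// that does not correspond to an actual input.
function attributeReferences(modelRefs: number[], docs: Doc[]): string[] {
  return modelRefs.map((i) => {
    const doc = docs[i];
    if (doc === undefined) {
      // The model invented a passage number: fail closed rather than
      // attach a fabricated (or cross-tenant) identifier.
      throw new Error(`LLM referenced unknown passage index ${i}`);
    }
    return doc.id; // index -> ID is a deterministic, auditable lookup
  });
}
```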