llm-safety-patterns

verified

Security patterns for LLM integrations including prompt injection defense and hallucination prevention. Use when implementing context separation, validating LLM outputs, or protecting against prompt injection attacks.

Marketplace

orchestkit

yonatangross/orchestkit

Plugin

ork

development

Repository

yonatangross/orchestkit
33 stars

skills/llm-safety-patterns/SKILL.md

Last Verified

January 25, 2026

Install Skill

npx add-skill https://github.com/yonatangross/orchestkit/blob/main/skills/llm-safety-patterns/SKILL.md -a claude-code --skill llm-safety-patterns

Installation paths:

Claude
.claude/skills/llm-safety-patterns/

Instructions

# LLM Safety Patterns

## The Core Principle

> **Identifiers flow AROUND the LLM, not THROUGH it.**
> **The LLM sees only content. Attribution happens deterministically.**
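
A minimal sketch of this principle, assuming a summarization task. The names `SystemContext`, `build_prompt`, `fake_llm`, `attribute`, and `run` are hypothetical and do not come from the skill itself; `fake_llm` stands in for a real model call so the example runs offline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemContext:
    """Identifiers that travel around the model (hypothetical field names)."""
    user_id: str
    tenant_id: str
    trace_id: str


def build_prompt(content: str) -> str:
    """Pre-LLM step: the prompt is built from content only, never from ctx."""
    return f"Summarize the following text:\n\n{content}"


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; echoes the first 40 chars of content."""
    return "summary: " + prompt.split("\n\n", 1)[1][:40]


def attribute(ctx: SystemContext, llm_text: str) -> dict:
    """Post-LLM step: attach identifiers deterministically from ctx."""
    return {
        "user_id": ctx.user_id,
        "tenant_id": ctx.tenant_id,
        "trace_id": ctx.trace_id,
        "summary": llm_text,
    }


def run(ctx: SystemContext, content: str, llm=fake_llm) -> dict:
    """The context object is passed to attribute() and never to the model."""
    prompt = build_prompt(content)
    return attribute(ctx, llm(prompt))
```

Because `build_prompt` does not accept the context object at all, an identifier cannot reach the model through it by accident; the IDs in the result come only from `ctx`, regardless of what the model returns.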

## Why This Matters

When identifiers appear in prompts, several failure modes follow:

1. **Hallucination:** LLM invents IDs that don't exist
2. **Confusion:** LLM mixes up which ID belongs where
3. **Injection:** Attacker manipulates IDs via prompt injection
4. **Leakage:** IDs appear in logs, caches, traces
5. **Cross-tenant:** LLM could reference other users' data
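
One way to guard against these failure modes is to audit each prompt for identifier-like strings before it is sent. The sketch below is an assumption, not part of the skill: it treats UUIDs as the identifier format and accepts an optional list of known IDs to check for; `find_identifiers` and `assert_prompt_clean` are hypothetical helper names.

```python
import re
from typing import Iterable, List

# Assumption: identifiers are UUIDs. Adjust the pattern to your own ID format.
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)


def find_identifiers(prompt: str, known_ids: Iterable[str] = ()) -> List[str]:
    """Return UUID-shaped strings and any known IDs that occur in the prompt."""
    hits = UUID_RE.findall(prompt)
    hits += [i for i in known_ids if i and i in prompt]
    return sorted(set(hits))


def assert_prompt_clean(prompt: str, known_ids: Iterable[str] = ()) -> None:
    """Raise if the prompt contains identifiers.

    The error message reports only a count, so the IDs themselves do not
    end up in logs or traces.
    """
    leaked = find_identifiers(prompt, known_ids)
    if leaked:
        raise ValueError(f"{len(leaked)} identifier(s) found in prompt")
```

Calling `assert_prompt_clean(prompt, known_ids=[ctx_user_id, ctx_tenant_id])` just before the model call turns an accidental leak into an immediate error. A pattern check like this is a backstop only: it will not catch identifiers in formats it does not know about.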

## The Architecture

```
┌─────────────────────────────────────────────────────────────────────────┐
│                                                                         │
│   SYSTEM CONTEXT (flows around LLM)                                     │
│   ┌─────────────────────────────────────────────────────────────────┐   │
│   │ user_id │ tenant_id │ analysis_id │ trace_id │ permissions     │   │
│   └─────────────────────────────────────────────────────────────────┘   │
│        │                                                       │        │
│        │                                                       │        │
│        ▼                                                       ▼        │
│   ┌─────────┐                                           ┌─────────┐    │
│   │ PRE-LLM │       ┌─────────────────────┐            │POST-LLM │    │
│   │ FILTER  │──────▶│        LLM          │───────────▶│ATTRIBUTE│    │
│   │         │       │                     │            │         │    │
│   │ Returns │       │ Sees ONLY:          │            │ Adds:   │    │
│   │ CONTENT │       │ - content text      │            │ - IDs   │    │
│   │ (no IDs)│       │ - context text      │            │ - refs  │    │
│   └─────────┘       │ (NO IDs!)           │            └─────────┘    │
│                     └─────────────────────┘                            │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
```
