operating-kubernetes

verified

Operating production Kubernetes clusters effectively with resource management, advanced scheduling, networking, storage, security hardening, and autoscaling. Use when deploying workloads to Kubernetes, configuring cluster resources, implementing security policies, or troubleshooting operational issues.

Marketplace: ai-design-components (ancoleman/ai-design-components)

Plugin: backend-ai-skills

Repository: ancoleman/ai-design-components (153 stars)

skills/operating-kubernetes/SKILL.md

Last Verified: February 1, 2026

Install Skill

npx add-skill https://github.com/ancoleman/ai-design-components/blob/main/skills/operating-kubernetes/SKILL.md -a claude-code --skill operating-kubernetes

Installation paths:

Claude
.claude/skills/operating-kubernetes/

Instructions

# Kubernetes Operations

## Purpose

Operating Kubernetes clusters in production requires mastery of resource management, scheduling patterns, networking architecture, storage strategies, security hardening, and autoscaling. This skill provides operations-first frameworks for right-sizing workloads, implementing high-availability patterns, securing clusters with RBAC and Pod Security Standards, and systematically troubleshooting common failures.

Use this skill when deploying applications to Kubernetes, configuring cluster resources, implementing NetworkPolicies for zero-trust security, setting up autoscaling (HPA, VPA, KEDA), managing persistent storage, or diagnosing operational issues like CrashLoopBackOff or resource exhaustion.

## When to Use This Skill

**Common Triggers:**
- "Deploy my application to Kubernetes"
- "Configure resource requests and limits"
- "Set up autoscaling for my pods"
- "Implement NetworkPolicies for security"
- "My pod is stuck in Pending/CrashLoopBackOff"
- "Configure RBAC with least privilege"
- "Set up persistent storage for my database"
- "Spread pods across availability zones"

**Operations Covered:**
- Resource management (CPU/memory, QoS classes, quotas)
- Advanced scheduling (affinity, taints, topology spread)
- Networking (NetworkPolicies, Ingress, Gateway API)
- Storage operations (StorageClasses, PVCs, CSI)
- Security hardening (RBAC, Pod Security Standards, policies)
- Autoscaling (HPA, VPA, KEDA, cluster autoscaler)
- Troubleshooting (systematic debugging playbooks)

## Resource Management

### Quality of Service (QoS) Classes

Kubernetes assigns QoS classes based on resource requests and limits:

**Guaranteed (Highest Priority):**
- Every container sets requests equal to limits for both CPU and memory
- Last to be evicted under node resource pressure
- Use for critical production services

```yaml
resources:
  requests:
    memory: "512Mi"
    cpu: "500m"
  limits:
    memory: "512Mi"  # Same as request
    cpu: "500m"
```
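
To show where the block above sits in a complete manifest, here is a minimal Pod sketch. The pod name `api-guaranteed`, the container name `app`, and the image `nginx:1.27` are placeholders chosen for illustration and do not come from the skill. Because the only container sets requests equal to limits for both CPU and memory, the pod should be assigned the Guaranteed class.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api-guaranteed        # hypothetical name
spec:
  containers:
    - name: app               # hypothetical container name
      image: nginx:1.27       # placeholder image
      resources:
        requests:
          memory: "512Mi"
          cpu: "500m"
        limits:
          memory: "512Mi"     # equal to the memory request
          cpu: "500m"         # equal to the CPU request
```

After applying the manifest, `kubectl get pod api-guaranteed -o jsonpath='{.status.qosClass}'` prints the class Kubernetes assigned, which is a quick way to confirm the configuration took effect.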

**Burstable (Medium Priority):**
- Re

Instruction Length: 13432 chars