aka. Agent Skills
Discover skills for AI coding agents. Works with Claude Code, OpenAI Codex, Gemini CLI, Cursor, and more.
Analyze AppWorld task failures to extract specific API patterns and generate actionable playbook bullets with concrete code examples
Generate Python code to solve AppWorld agent tasks using playbook bullet guidance. Use when the AppWorld executor needs executable Python code for tasks involving Spotify, Venmo, Gmail, Calendar, Contacts, or other AppWorld APIs.
Organize Grey Haven projects following standard structures for TanStack Start (frontend) and FastAPI (backend). Use when creating new projects, organizing files, or refactoring project layout.
This skill should be used when building React components with TypeScript, typing hooks, handling events, or when React TypeScript, React 19, Server Components are mentioned. Covers type-safe patterns for React 18-19 including generic components, proper event typing, and TanStack Router integration.
AI-powered intelligent debugging with stack trace analysis, error pattern recognition, and automated fix suggestions. Use when debugging complex errors, analyzing stack traces, or performing root cause analysis. Triggers: 'debug', 'error analysis', 'stack trace', 'root cause', 'troubleshooting'.
Comprehensive test suite generation with unit tests, integration tests, edge cases, and error handling. Use when generating tests for existing code, improving coverage, or creating systematic test suites. Triggers: 'generate tests', 'add tests', 'test coverage', 'write tests for', 'create test suite'.
Grey Haven's security best practices - input validation, output sanitization, multi-tenant RLS, secret management with Doppler, rate limiting, OWASP Top 10 for TanStack/FastAPI stack. Use when implementing security-critical features.
Get current time in any timezone and convert times between timezones. Use when working with time, dates, timezones, scheduling across regions, or when user mentions specific cities/regions for time queries. Supports IANA timezone names.
Format commit messages according to Grey Haven Studio's actual commitlint configuration (100 char header, lowercase subject, conventional commits). Use when creating git commits or reviewing commit messages.
Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.
This skill helps launch and configure the Chrome DevTools MCP server, giving Claude visual access to a live browser for debugging and automation. Use when the user asks to set up browser debugging, launch Chrome with DevTools, configure chrome-devtools-mcp, see what my app looks like, take screenshots of my web application, check the browser console, debug console errors, inspect network requests, analyse API responses, measure Core Web Vitals or page performance, run a Lighthouse audit, test button clicks or form submissions, automate browser interactions, fill out forms programmatically, simulate user actions, emulate mobile devices or slow networks, capture DOM snapshots, execute JavaScript in the browser, or troubleshoot Chrome DevTools MCP connection issues. Supports Windows, Linux, and WSL2 environments.
Implement observability and monitoring using Cloudflare Workers Analytics, wrangler tail for logs, and health checks. Use when setting up monitoring, implementing logging, configuring alerts, or debugging production issues.
DevOps and infrastructure troubleshooting for Cloudflare Workers, PlanetScale PostgreSQL, and distributed systems. Use when debugging deployment issues, infrastructure problems, connection errors, or performance degradation.
Comprehensive Google Analytics 4 guide covering property setup, events, custom events, recommended events, custom dimensions, user tracking, audiences, reporting, BigQuery integration, gtag.js implementation, GTM integration, Measurement Protocol, DebugView, privacy compliance, and data management. Use when working with GA4 implementation, tracking, analysis, or any GA4-related tasks.
This skill should be used when building AI features with Vercel AI SDK, using useChat, streamText, or generateObject, or when "AI SDK", "streaming chat", or "structured outputs" are mentioned.
Evaluate LLM outputs with multi-dimensional rubrics, handle non-determinism, and implement LLM-as-judge patterns. Essential for production LLM systems. Use when testing prompts, validating outputs, comparing models, or when user mentions 'evaluation', 'testing LLM', 'rubric', 'LLM-as-judge', 'output quality', 'prompt testing', or 'model comparison'.
Comprehensive performance analysis and optimization for algorithms (O(n²)→O(n)), databases (N+1 queries, indexes), React (memoization, virtual lists), bundles (code splitting), API caching, and memory leaks. 85%+ improvement rate. Use when application is slow, response times exceed SLA, high CPU/memory usage, performance budgets needed, or when user mentions 'performance', 'slow', 'optimization', 'bottleneck', 'speed up', 'latency', 'memory leak', or 'performance tuning'.
Use Microsoft Fabric CLI (fab) to manage workspaces, semantic models, reports, notebooks, lakehouses, and other Fabric resources via file-system metaphor and commands. Use when deploying Fabric items, running jobs, querying data, managing OneLake files, or automating Fabric operations. Invoke this skill automatically whenever a user mentions the Fabric CLI, fab, or Fabric.
Python Test-Driven Development expertise with pytest, strict red-green-refactor methodology, FastAPI testing patterns, and Pydantic model testing. Use when implementing Python features with TDD, writing pytest tests, testing FastAPI endpoints, developing with test-first approach, or when user mentions 'Python TDD', 'pytest', 'FastAPI testing', 'red-green-refactor', 'Python unit tests', 'test-driven Python', or 'Python test coverage'.
Master TDD orchestration with multi-agent coordination, strict red-green-refactor enforcement, automated test generation, coverage tracking, and >90% coverage quality gates. Coordinates tdd-python, tdd-typescript, and test-generator agents. Use when implementing features with TDD workflow, coordinating multiple TDD agents, enforcing test-first development, or when user mentions 'TDD workflow', 'test-first', 'TDD orchestration', 'multi-agent TDD', 'test coverage', or 'red-green-refactor'.