using-deep-rl (verified)

Routes to appropriate deep-RL skills based on problem type and algorithm family.

Marketplace: foundryside-marketplace
Plugin: yzmir-deep-rl (ai-ml)
Repository: tachyon-beep/skillpacks (8 stars)
Path: plugins/yzmir-deep-rl/skills/using-deep-rl/SKILL.md
Last verified: January 24, 2026

Install (Claude Code, via the add-skill CLI):

npx add-skill https://github.com/tachyon-beep/skillpacks/blob/main/plugins/yzmir-deep-rl/skills/using-deep-rl/SKILL.md -a claude-code --skill using-deep-rl

Installation path (Claude): .claude/skills/using-deep-rl/

Instructions

# Using Deep RL Meta-Skill

## When to Use This Skill

Invoke this meta-skill when you encounter:

- **RL Implementation**: Implementing reinforcement learning algorithms (Q-learning, DQN, PPO, SAC, etc.)
- **Agent Training**: Training agents in environments (games, robotics, control systems)
- **Sequential Decision-Making**: Problems requiring learning from trial and error
- **Policy Optimization**: Learning policies that maximize cumulative rewards
- **RL Debugging**: Diagnosing training issues, agents that are not learning, or reward problems

This is the **entry point** for the deep-rl pack. It routes to 13 specialized skills based on problem characteristics.

## How to Access Reference Sheets

**IMPORTANT**: All reference sheets are located in the SAME DIRECTORY as this SKILL.md file.

When this skill is loaded from:
  `skills/using-deep-rl/SKILL.md`

Reference sheets like `rl-foundations.md` are at:
  `skills/using-deep-rl/rl-foundations.md`

NOT at:
  `skills/rl-foundations.md` ← WRONG PATH
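
As a concrete illustration, resolving a reference sheet relative to the loaded SKILL.md might look like the following sketch. The `pathlib` usage and the literal paths are illustrative assumptions, not part of the pack:

```python
# Sketch: resolve a reference sheet relative to this SKILL.md.
# The literal path below is illustrative; use whatever path the skill
# was actually loaded from.
from pathlib import Path

skill_md = Path("skills/using-deep-rl/SKILL.md")
reference = skill_md.parent / "rl-foundations.md"  # same directory as SKILL.md
print(reference)  # skills/using-deep-rl/rl-foundations.md
```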

---

## Core Principle

**Problem type determines algorithm family.**

The correct approach depends on:

1. **Action Space**: Discrete (button presses) vs Continuous (joint angles)
2. **Data Regime**: Online (interact with environment) vs Offline (fixed dataset)
3. **Experience Level**: Need foundations vs ready to implement
4. **Special Requirements**: Multi-agent, model-based, exploration, reward design

**Always clarify the problem BEFORE suggesting algorithms.**
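
For example, the routing decision can be sketched as a small lookup over these dimensions. The `ProblemSpec` fields, the `route_skill` helper, and the mapping below are a minimal, hypothetical sketch of the idea, not an API provided by this pack:

```python
# Minimal routing sketch. The fields and skill names mirror the decision
# dimensions above; they are illustrative, not an interface of this pack.
from dataclasses import dataclass

@dataclass
class ProblemSpec:
    action_space: str          # "discrete" or "continuous"
    data_regime: str           # "online" or "offline"
    needs_foundations: bool = False

def route_skill(spec: ProblemSpec) -> str:
    """Map problem characteristics to a candidate skill from the pack."""
    if spec.needs_foundations:
        return "rl-foundations"
    if spec.data_regime == "offline":
        return "offline-rl"
    if spec.action_space == "discrete":
        return "value-based-methods"   # e.g. DQN-family
    return "actor-critic-methods"      # e.g. SAC/TD3 for continuous control

print(route_skill(ProblemSpec(action_space="continuous", data_regime="online")))
# -> actor-critic-methods
```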

---

## The 13 Deep RL Skills

1. **rl-foundations** - MDP formulation, Bellman equations, value vs policy basics
2. **value-based-methods** - Q-learning, DQN, Double DQN, Dueling DQN, Rainbow
3. **policy-gradient-methods** - REINFORCE, PPO, TRPO, policy optimization
4. **actor-critic-methods** - A2C, A3C, SAC, TD3, advantage functions
5. **model-based-rl** - World models, Dyna, MBPO, planning with learned models
6. **offline-rl** - Batch RL, CQL, IQL, learning from fixed datasets
7. **multi-agent-rl** - MARL, cooperative
