Guides you through implementing a research paper step-by-step from scratch. Use when asked to implement a paper, code up a paper, reproduce research results, or build a model from a paper. Focuses on building understanding through implementation with checkpoint questions.
View on GitHub: skills/implement-paper-from-scratch/SKILL.md
February 1, 2026
```shell
npx add-skill https://github.com/GhostScientist/skills/blob/main/skills/implement-paper-from-scratch/SKILL.md -a claude-code --skill implement-paper-from-scratch
```

Installation paths:
`.claude/skills/implement-paper-from-scratch/`

# Implement Paper From Scratch

The best way to truly understand a paper is to implement it. This skill guides you through that process methodically.

## Philosophy

- **No copy-pasting from reference implementations** - We build understanding, not just working code
- **Checkpoint questions verify understanding** - You should be able to answer "why" at each step
- **Minimal dependencies** - Use NumPy/PyTorch fundamentals, not high-level wrappers
- **Deliberate debugging** - Bugs are learning opportunities, not obstacles

## Process

### Phase 1: Pre-Implementation Analysis

Before writing any code:

1. **Identify the core algorithm** - Strip away ablations, extensions, bells and whistles. What's the minimal version?
2. **List the components** - Break into modules:
   - Data pipeline
   - Model architecture
   - Loss function(s)
   - Training loop
   - Evaluation metrics
3. **Find the tricky parts** - What's non-obvious?
   - Custom layers or operations
   - Numerical stability concerns
   - Hyperparameter sensitivity
   - Implementation details buried in appendices
4. **Gather reference numbers** - What should we expect?
   - Training loss trajectory
   - Validation metrics at convergence
   - Compute requirements (if stated)

### Phase 2: Scaffolded Implementation

Build up the implementation in this order:

#### Step 1: Data

```python
# Start with synthetic/toy data
# Verify shapes and types before touching real data
```

**Checkpoint:** Can you describe what each tensor represents and its expected shape?

#### Step 2: Model Architecture

```python
# Build layer by layer
# Print shapes at each stage
# Verify parameter counts match paper
```

**Checkpoint:** If you randomly initialize and do a forward pass, do the output shapes match what the paper describes?

#### Step 3: Loss Function

```python
# Implement exactly as described
# Test with known inputs/outputs
# Check gradient flow
```

**Checkpoint:** Can you explain each term in the loss and why it's there?
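As an illustration of Step 1, here is a minimal sketch of a synthetic data pipeline with shape checks. The task, names, and dimensions (`batch_size`, `seq_len`, `vocab_size`) are all hypothetical placeholders, not from any particular paper:

```python
import numpy as np

# Hypothetical toy task: next-token prediction on random token ids.
batch_size, seq_len, vocab_size = 4, 16, 100
rng = np.random.default_rng(0)

# Draw seq_len + 1 tokens so targets can be inputs shifted by one.
tokens = rng.integers(0, vocab_size, size=(batch_size, seq_len + 1))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

# Verify shapes and dtypes before touching real data.
assert inputs.shape == (batch_size, seq_len)
assert targets.shape == (batch_size, seq_len)
assert np.issubdtype(inputs.dtype, np.integer)
```

Starting from synthetic data like this lets you exercise the whole pipeline before any dataset-specific loading code exists.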
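For Step 2, a sketch of the "print shapes, verify parameter counts" habit on a deliberately tiny model. The two-layer MLP and its dimensions are illustrative assumptions, not an architecture from the paper you are reproducing:

```python
import numpy as np

# Hypothetical two-layer MLP; dimensions are illustrative only.
d_in, d_hidden, d_out = 32, 64, 10
rng = np.random.default_rng(0)

params = {
    "W1": rng.normal(0.0, 0.02, (d_in, d_hidden)),
    "b1": np.zeros(d_hidden),
    "W2": rng.normal(0.0, 0.02, (d_hidden, d_out)),
    "b2": np.zeros(d_out),
}

def forward(x, p):
    h = np.maximum(0.0, x @ p["W1"] + p["b1"])  # ReLU hidden layer
    return h @ p["W2"] + p["b2"]                # logits

# Forward pass on random init: do output shapes match the spec?
x = rng.normal(size=(4, d_in))
logits = forward(x, params)
assert logits.shape == (4, d_out)

# Parameter count check against the closed-form expectation.
n_params = sum(p.size for p in params.values())
assert n_params == d_in * d_hidden + d_hidden + d_hidden * d_out + d_out
```

The same pattern scales up: compute the expected parameter count by hand from the paper's table, then assert your model matches it exactly before training anything.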
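For Step 3, a sketch of "test with known inputs/outputs" and "check gradient flow" using cross-entropy as a stand-in loss (swap in whatever loss the paper defines). The known-input test exploits that uniform logits over $k$ classes give a loss of exactly $\log k$, and the gradient check compares a finite-difference estimate against the analytic gradient:

```python
import numpy as np

def cross_entropy(logits, target):
    # Numerically stable log-softmax: subtract the max first.
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[target]

# Known input/output: uniform logits over 4 classes -> loss = log(4).
assert np.isclose(cross_entropy(np.zeros(4), 0), np.log(4))

# Gradient check via forward finite differences.
logits = np.array([2.0, -1.0, 0.5, 0.0])
target, eps = 2, 1e-6
grad = np.zeros_like(logits)
for i in range(len(logits)):
    bumped = logits.copy()
    bumped[i] += eps
    grad[i] = (cross_entropy(bumped, target) - cross_entropy(logits, target)) / eps

# Analytic gradient of cross-entropy: softmax(logits) - one_hot(target).
probs = np.exp(logits) / np.exp(logits).sum()
analytic = probs.copy()
analytic[target] -= 1.0
assert np.allclose(grad, analytic, atol=1e-4)
```

A loss that passes a known-value test and a finite-difference gradient check is far less likely to be the source of a mysterious training plateau later.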