experiment-management

Use this skill when setting up ML experiment infrastructure. Covers wandb/tensorboard integration, hydra/omegaconf configuration management, experiment reproducibility, and results visualization.

Plugin: everything-claude-code (workflow)
Repository: yxbian23/ai-research-claude-code
Path: skills/experiment-management/SKILL.md
Last Verified: January 25, 2026

Install with the add-skill CLI:

npx add-skill https://github.com/yxbian23/ai-research-claude-code/blob/main/skills/experiment-management/SKILL.md -a claude-code --skill experiment-management

Installation path (Claude Code): .claude/skills/experiment-management/

Instructions

# Experiment Management

This skill provides comprehensive guidance for managing machine learning experiments systematically.

## When to Activate

- Setting up experiment tracking
- Configuring hyperparameters with hydra/omegaconf
- Ensuring experiment reproducibility
- Analyzing and visualizing results
- Comparing multiple experiments
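
For the reproducibility item above, a minimal seeding sketch using only the standard library; in a real project you would also seed numpy and torch as the comments note (this is a hedged starting point, not the skill's prescribed recipe):

```python
import random

def set_seed(seed: int = 42) -> None:
    # Seed Python's RNG. For full reproducibility also call
    # np.random.seed(seed), torch.manual_seed(seed), and
    # torch.cuda.manual_seed_all(seed), and consider setting
    # torch.backends.cudnn.deterministic = True.
    random.seed(seed)

set_seed(0)
a = [random.random() for _ in range(3)]
set_seed(0)
b = [random.random() for _ in range(3)]
assert a == b  # reseeding reproduces the same sequence
```

Log the seed itself (e.g. in the wandb config) so a run can be replayed later.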

## Weights & Biases (wandb) Integration

### Basic Setup

```python
import wandb

# Initialize wandb run
wandb.init(
    project="my-research-project",
    name="exp-001-baseline",
    config={
        "learning_rate": 1e-4,
        "batch_size": 32,
        "epochs": 100,
        "model": "resnet50",
    },
    tags=["baseline", "v1"],
    notes="Initial baseline experiment",
)

# Access config
config = wandb.config
lr = config.learning_rate
```
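
If your training machines lack network access, wandb also supports offline logging; a minimal sketch using the standard `WANDB_MODE` environment variable (equivalently, pass `mode="offline"` to `wandb.init`):

```python
import os

# Set before wandb.init() is called: runs are written to the local
# wandb/ directory and can be uploaded later with `wandb sync`.
os.environ["WANDB_MODE"] = "offline"
```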

### Logging Metrics

```python
# Log scalar metrics
wandb.log({
    "train/loss": train_loss,
    "train/accuracy": train_acc,
    "val/loss": val_loss,
    "val/accuracy": val_acc,
    "epoch": epoch,
    "lr": optimizer.param_groups[0]['lr'],
})

# Log with step
wandb.log({"loss": loss}, step=global_step)

# Log histograms
wandb.log({"gradients": wandb.Histogram(gradients)})

# Log images
wandb.log({"samples": [wandb.Image(img) for img in images]})

# Log tables
table = wandb.Table(columns=["id", "prediction", "target"])
for i, (pred, target) in enumerate(zip(predictions, targets)):
    table.add_data(i, pred, target)
wandb.log({"predictions": table})
```

### Model Checkpointing

```python
# Save model artifact
artifact = wandb.Artifact(
    name=f"model-{wandb.run.id}",
    type="model",
    description="Trained model checkpoint",
)
artifact.add_file("model.pt")
wandb.log_artifact(artifact)

# Load model artifact
artifact = wandb.use_artifact("model-abc123:latest")
artifact_dir = artifact.download()
model.load_state_dict(torch.load(f"{artifact_dir}/model.pt"))
```

### Hyperparameter Sweeps

```python
# sweep_config.yaml
sweep_config = {
    "method": "bayes",  # or "random", "grid"
  
