Use this skill when setting up ML experiment infrastructure. Covers wandb/tensorboard integration, hydra/omegaconf configuration management, experiment reproducibility, and results visualization.
Source: yxbian23/ai-research-claude-code (skills/experiment-management/SKILL.md)
January 25, 2026

Install:
npx add-skill https://github.com/yxbian23/ai-research-claude-code/blob/main/skills/experiment-management/SKILL.md -a claude-code --skill experiment-management
Installation path: .claude/skills/experiment-management/

# Experiment Management
This skill provides comprehensive guidance for managing machine learning experiments systematically.
## When to Activate
- Setting up experiment tracking
- Configuring hyperparameters with hydra/omegaconf
- Ensuring experiment reproducibility
- Analyzing and visualizing results
- Comparing multiple experiments
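For the reproducibility case above, a common first step is seeding every source of randomness at the start of a run. A minimal sketch (the `set_seed` helper name is illustrative; the PyTorch calls are guarded so the helper also works without torch installed):

```python
import os
import random

import numpy as np


def set_seed(seed: int = 42) -> None:
    """Seed the common sources of randomness for reproducible runs."""
    random.seed(seed)
    np.random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    # If PyTorch is available, seed it as well (CPU and all GPUs).
    try:
        import torch

        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


set_seed(42)
```

Log the seed in your experiment config (e.g. in `wandb.config`) so a run can be re-executed with the same value later.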
## Weights & Biases (wandb) Integration
### Basic Setup
```python
import wandb

# Initialize a wandb run
wandb.init(
    project="my-research-project",
    name="exp-001-baseline",
    config={
        "learning_rate": 1e-4,
        "batch_size": 32,
        "epochs": 100,
        "model": "resnet50",
    },
    tags=["baseline", "v1"],
    notes="Initial baseline experiment",
)

# Access config
config = wandb.config
lr = config.learning_rate
```
### Logging Metrics
```python
# Log scalar metrics
wandb.log({
    "train/loss": train_loss,
    "train/accuracy": train_acc,
    "val/loss": val_loss,
    "val/accuracy": val_acc,
    "epoch": epoch,
    "lr": optimizer.param_groups[0]["lr"],
})

# Log with an explicit step
wandb.log({"loss": loss}, step=global_step)

# Log histograms
wandb.log({"gradients": wandb.Histogram(gradients)})

# Log images
wandb.log({"samples": [wandb.Image(img) for img in images]})

# Log tables
table = wandb.Table(columns=["id", "prediction", "target"])
for i, (pred, target) in enumerate(zip(predictions, targets)):
    table.add_data(i, pred, target)
wandb.log({"predictions": table})
```
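Scalar metrics like `train/loss` above are usually averaged over an epoch before being logged once per epoch. A minimal running-average tracker keeps the training loop tidy (the `AverageMeter` class is a hypothetical helper, not part of the wandb API):

```python
class AverageMeter:
    """Tracks a running (size-weighted) average of a metric over an epoch."""

    def __init__(self) -> None:
        self.sum = 0.0
        self.count = 0

    def update(self, value: float, n: int = 1) -> None:
        # Accumulate a per-batch value weighted by the batch size n.
        self.sum += value * n
        self.count += n

    @property
    def avg(self) -> float:
        return self.sum / max(self.count, 1)


# Usage inside a training loop (batch losses and sizes are illustrative):
loss_meter = AverageMeter()
for batch_loss, batch_size in [(0.9, 32), (0.7, 32), (0.5, 16)]:
    loss_meter.update(batch_loss, batch_size)
# Then log once per epoch:
# wandb.log({"train/loss": loss_meter.avg, "epoch": epoch})
```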
### Model Checkpointing
```python
# Save a model checkpoint as an artifact
artifact = wandb.Artifact(
    name=f"model-{wandb.run.id}",
    type="model",
    description="Trained model checkpoint",
)
artifact.add_file("model.pt")
wandb.log_artifact(artifact)

# Load a model artifact
artifact = wandb.use_artifact("model-abc123:latest")
artifact_dir = artifact.download()
model.load_state_dict(torch.load(f"{artifact_dir}/model.pt"))
```
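It also helps to write a small metadata sidecar next to the checkpoint so a downloaded artifact can be re-associated with its run, epoch, and config later. A standard-library sketch (the function name, filename, and fields are illustrative):

```python
import json
from pathlib import Path


def save_checkpoint_metadata(ckpt_dir: str, run_id: str, epoch: int, config: dict) -> Path:
    """Write a JSON sidecar describing a checkpoint; returns its path."""
    path = Path(ckpt_dir) / "metadata.json"
    path.write_text(
        json.dumps({"run_id": run_id, "epoch": epoch, "config": config}, indent=2)
    )
    return path


# Usage: add the sidecar to the artifact alongside the weights, e.g.
# meta = save_checkpoint_metadata("checkpoints", wandb.run.id, epoch, dict(wandb.config))
# artifact.add_file(str(meta))
```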
### Hyperparameter Sweeps
```python
# Sweep configuration (can also be written as a sweep_config.yaml file)
sweep_config = {
    "method": "bayes",  # or "random", "grid"