Debug AOTInductor (AOTI) errors and crashes. Use when encountering AOTI segfaults, device mismatch errors, constant loading failures, or runtime errors from aot_compile, aot_load, aoti_compile_and_package, or aoti_load_package.
# AOTI Debugging Guide
This skill helps diagnose and fix common AOTInductor issues.
## First Step: Always Check Device and Shape Matching
**For ANY AOTI error (segfault, exception, crash, wrong output), ALWAYS check these first:**
1. **Compile device == Load device**: The model must be loaded on the same device type it was compiled on
2. **Input devices match**: Runtime inputs must be on the same device as the compiled model
3. **Input shapes match**: Runtime input shapes must match the shapes used during compilation (or satisfy dynamic shape constraints)
```python
import torch

# During compilation - note the device and shapes
model = MyModel().eval()  # What device? CPU or .cuda()?
inp = torch.randn(2, 10)  # What device? What shape?
compiled_so = torch._inductor.aot_compile(model, (inp,))
# During loading - device type MUST match compilation
loaded = torch._export.aot_load(compiled_so, "???") # Must match model/input device above
# During inference - device and shapes MUST match
out = loaded(inp.to("???")) # Must match compile device, shape must match
```
**If any of these don't match, you will get errors ranging from segfaults to exceptions to wrong outputs.**
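Because a device or shape mismatch can surface as a hard segfault rather than a Python exception, it can help to validate inputs yourself before calling the loaded model. The sketch below is not part of AOTI; `check_aoti_inputs`, and the idea of recording `expected_device` and `expected_shapes` at compile time, are assumptions for illustration.

```python
import torch

def check_aoti_inputs(inputs, expected_device, expected_shapes):
    """Raise a clear Python error instead of risking a native crash.

    `expected_device` is the device type used at compile time (e.g. "cpu",
    "cuda"); `expected_shapes` are the shapes of the compile-time example
    inputs. Both must be recorded by you - AOTI does not expose them.
    """
    for i, (t, shape) in enumerate(zip(inputs, expected_shapes)):
        if t.device.type != expected_device:
            raise ValueError(
                f"input {i} is on {t.device.type!r}, but the model "
                f"was compiled for {expected_device!r}"
            )
        if tuple(t.shape) != tuple(shape):
            raise ValueError(
                f"input {i} has shape {tuple(t.shape)}, "
                f"expected {tuple(shape)}"
            )
```

Run this immediately before invoking the loaded model; a `ValueError` from here is far easier to diagnose than a crash inside the compiled shared library.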
## Key Constraint: Device Type Matching
**AOTI requires compile and load to use the same device type.**
- If you compile on CUDA, you must load on CUDA (device index can differ)
- If you compile on CPU, you must load on CPU
- Cross-device loading (e.g., compile on GPU, load on CPU) is NOT supported
## Common Error Patterns
### 1. Device Mismatch Segfault
**Symptom**: Segfault, exception, or crash during `aot_load()` or model execution.
**Example error messages**:
- `The specified pointer resides on host memory and is not registered with any CUDA device`
- Crash during constant loading in AOTInductorModelBase
- `Expected out tensor to have device cuda:0, but got cpu instead`
**Cause**: Compile and load device types don't match (see "First Step" above).
**Solution**: Ensure compile and load use the same device type, and move all runtime inputs to that device before calling the model.