Provide architecture-specific GPU optimization advice for NVIDIA GPUs (Ampere, Hopper, Ada). Use when discussing GPU architecture features, compute capability, or hardware-specific optimizations.
View on GitHub: keith-mvs/plugin-nsight-copilot
nsight-copilot
February 1, 2026
Install with:

```shell
npx add-skill https://github.com/keith-mvs/plugin-nsight-copilot/blob/main/./skills/gpu-architecture-advisor/SKILL.md -a claude-code --skill gpu-architecture-advisor
```

Installation path:
`.claude/skills/gpu-architecture-advisor/`

# GPU Architecture Advisor Skill

This skill provides architecture-specific optimization guidance for NVIDIA GPUs.

## When I activate

I automatically activate when you:

- Mention specific GPU architectures (Ampere, Hopper, Ada, Turing)
- Ask about compute capability requirements
- Discuss Tensor Cores, RT Cores, or specialized hardware
- Need architecture-specific optimization advice
- Compare performance across GPU generations

## GPU Architectures I Know

### Hopper (Compute 9.0)

- **Tensor Cores**: 4th gen, FP8 support, Transformer Engine
- **Thread Block Clusters**: New hierarchy level between thread blocks and the grid
- **DPX instructions**: Dynamic programming acceleration
- **Async execution**: Enhanced asynchronous pipeline
- **L2 cache**: Larger and more configurable
- **Target GPUs**: H100, H200

### Ada Lovelace (Compute 8.9)

- **Tensor Cores**: 4th gen with FP8
- **Shader Execution Reordering**: Dynamic scheduling of divergent work
- **DLSS 3**: Optical flow acceleration
- **RT Cores**: 3rd gen ray tracing
- **Target GPUs**: RTX 4090, RTX 4080, L40

### Ampere (Compute 8.0, 8.6)

- **Tensor Cores**: 3rd gen, TF32 and BF16 support
- **Unified memory**: Improved page migration
- **Multi-Instance GPU**: Hardware partitioning
- **Async copy**: `cp.async` instructions that copy global memory to shared memory without staging through registers
- **Target GPUs**: A100, A30, RTX 3090, RTX 3080

### Turing (Compute 7.5)

- **Tensor Cores**: 2nd gen, INT8/INT4 support
- **RT Cores**: 1st gen ray tracing
- **Target GPUs**: RTX 2080, T4

## What I provide

### Architecture-Specific Optimization

I suggest optimizations leveraging:

- Tensor Core operations (WMMA, CUTLASS)
- Occupancy and launch-configuration tuning per architecture (warp size itself is fixed at 32)
- Cache hierarchy utilization
- Compute capability-specific features
- Memory bandwidth characteristics

### Code Examples

```cuda
// Ampere+ TF32 automatic conversion
#if __CUDA_ARCH__ >= 800
// TF32 mode automatically accelerates FP32 matrix math on Tensor Cores
#endif

// Asynchronous pipeline (requires #include <cuda/pipeline>;
// available from Ampere, compute 8.0, and enhanced on Hopper).
// Inside a kernel body:
#if __CUDA_ARCH__ >= 800
auto pipe = cuda::make_pipeline(); // per-thread (thread_scope_thread) pipeline
#endif
```
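The asynchronous-copy pattern can be expanded into a fuller kernel. Below is a minimal sketch for Ampere or newer (compute 8.0+) that stages one float per thread through shared memory with `cuda::memcpy_async`; the kernel name, `alpha` parameter, and sizing are illustrative, and launch configuration (grid/block sizes, dynamic shared memory) is assumed to be set by the caller:

```cuda
#include <cuda/pipeline>

// Stage one element per thread from global into shared memory via
// cp.async, then operate on the staged tile.
__global__ void scale_via_smem(const float* __restrict__ in,
                               float* __restrict__ out,
                               float alpha) {
    extern __shared__ float tile[];
    const size_t i = blockIdx.x * blockDim.x + threadIdx.x;

    // Per-thread pipeline: each thread manages its own async copy.
    auto pipe = cuda::make_pipeline();
    pipe.producer_acquire();
    cuda::memcpy_async(&tile[threadIdx.x], &in[i], sizeof(float), pipe);
    pipe.producer_commit();
    pipe.consumer_wait();   // wait for this thread's copy to land
    __syncthreads();        // make the whole tile visible to the block

    out[i] = alpha * tile[threadIdx.x];
    pipe.consumer_release();
}
```

On compute 8.0+ the copy bypasses registers entirely; on older architectures `cuda::memcpy_async` still compiles but falls back to ordinary loads and stores.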
### Feature Detection