xarray-for-multidimensional-data

Work with labeled multidimensional arrays for scientific data analysis using Xarray. Use when handling climate data, satellite imagery, oceanographic data, or any multidimensional datasets with coordinates and metadata. Ideal for NetCDF/HDF5 files, time series analysis, and large datasets requiring lazy loading with Dask.

Marketplace: rse-plugins (uw-ssec/rse-plugins)

Plugin: scientific-domain-applications (data-science)

Repository: uw-ssec/rse-plugins (10 stars)

plugins/scientific-domain-applications/skills/xarray-for-multidimensional-data/SKILL.md

Last Verified: January 22, 2026

Install Skill

npx add-skill https://github.com/uw-ssec/rse-plugins/blob/main/plugins/scientific-domain-applications/skills/xarray-for-multidimensional-data/SKILL.md -a claude-code --skill xarray-for-multidimensional-data

Installation paths:

Claude
.claude/skills/xarray-for-multidimensional-data/

Instructions

# Xarray for Multidimensional Data

Master **Xarray**, the library for working with labeled multidimensional arrays in scientific Python. Learn how to efficiently handle datasets with multiple dimensions, coordinates, and metadata, from climate data and satellite imagery to experimental measurements and simulations.

**Official Documentation**: https://docs.xarray.dev/

**GitHub**: https://github.com/pydata/xarray

## Quick Reference Card

### Installation & Setup
```bash
# Using pixi (recommended for scientific projects)
pixi add xarray netcdf4 dask

# Using pip (quotes keep shells such as zsh from expanding the brackets)
pip install "xarray[complete]"

# Optional dependencies for specific formats
pixi add zarr h5netcdf scipy bottleneck

# Geospatial extensions (for raster data, CRS handling, reprojection)
pixi add rioxarray xesmf

# DataTree is built into Xarray (no separate installation needed)
```
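To confirm which of the optional packages above are visible to the current environment, a small check like the following may help. It is a minimal sketch that uses only the standard library; `OPTIONAL_MODULES` and `check_optional_modules` are hypothetical names introduced here for illustration, and the list assumes the usual import names (the `netcdf4` package is imported as `netCDF4`).

```python
from importlib.util import find_spec

# Import names for the optional packages listed in the install commands.
# Assumption: standard import names; note that netcdf4 imports as "netCDF4".
OPTIONAL_MODULES = [
    "netCDF4", "h5netcdf", "zarr", "scipy",
    "dask", "bottleneck", "rioxarray", "xesmf",
]


def check_optional_modules(names=OPTIONAL_MODULES):
    """Return {module_name: bool} without actually importing the modules."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_optional_modules().items():
        print(f"{name:12s} {'found' if found else 'missing'}")
```

A missing entry only means the package is not installed in the active environment; Xarray itself still works, but the corresponding file format or feature will be unavailable.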

### Essential Xarray Concepts
```python
import xarray as xr
import numpy as np
import pandas as pd

# DataArray: Single labeled array
temperature = xr.DataArray(
    data=np.random.randn(3, 4),
    dims=["time", "location"],
    coords={
        # datetime64 values (not strings) so that "time.month" and other
        # datetime accessors work
        "time": pd.date_range("2024-01-01", periods=3),
        "location": ["A", "B", "C", "D"]
    },
    name="temperature"
)

# Dataset: Collection of DataArrays
ds = xr.Dataset({
    "temperature": temperature,
    "pressure": (["time", "location"], np.random.randn(3, 4))
})
```
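The relationship between a DataArray and a Dataset is easier to see by inspecting their dimensions, coordinates, and attributes. The sketch below rebuilds a small example with fixed values; the `units` attribute and the constant data are assumptions chosen for illustration, not part of the original card.

```python
import numpy as np
import pandas as pd
import xarray as xr

# A DataArray carries its values together with dimension names,
# coordinate labels, and free-form metadata (attrs).
temperature = xr.DataArray(
    data=np.zeros((3, 4)),
    dims=["time", "location"],
    coords={
        "time": pd.date_range("2024-01-01", periods=3),
        "location": ["A", "B", "C", "D"],
    },
    name="temperature",
    attrs={"units": "degC"},  # illustrative metadata
)

# A Dataset groups variables that share dimensions and coordinates.
ds = xr.Dataset({
    "temperature": temperature,
    "pressure": (["time", "location"], np.ones((3, 4))),
})

print(temperature.dims)         # dimension names
print(temperature.shape)        # array shape
print(dict(ds.sizes))           # dimension name -> length
print(list(ds.data_vars))       # variable names in the Dataset
print(ds["pressure"].coords)    # pressure picks up the shared coordinates
```

Because `pressure` is declared on the same dimensions, it shares the `time` and `location` coordinates defined by `temperature`.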

### Essential Operations
```python
# Selection by label
ds.sel(time="2024-01-01")
ds.sel(location="A")

# Selection by index
ds.isel(time=0)

# Slicing (label-based slices include both endpoints)
ds.sel(time=slice("2024-01-01", "2024-01-02"))

# Aggregation
ds.mean(dim="time")
ds.sum(dim="location")

# Computation
ds["temperature"] + 273.15  # Celsius to Kelvin
ds.groupby("time.month").mean()

# I/O operations
ds.to_netcdf("data.nc")
ds = xr.open_dataset("data.nc")
```
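The difference between label-based and position-based selection, and the effect of aggregation, can be checked on an array with known values. This is a minimal sketch assuming a 3x4 grid filled with 0 to 11 and a daily datetime `time` coordinate; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Deterministic values (0..11) so results can be verified by hand.
da = xr.DataArray(
    np.arange(12.0).reshape(3, 4),
    dims=["time", "location"],
    coords={
        "time": pd.date_range("2024-01-01", periods=3),
        "location": list("ABCD"),
    },
    name="temperature",
)

# Label-based and position-based selection address the same element here.
by_label = da.sel(time="2024-01-02", location="B")
by_index = da.isel(time=1, location=1)

# Label slices include both endpoints, so this keeps two time steps.
window = da.sel(time=slice("2024-01-01", "2024-01-02"))

# Aggregating over "time" removes that dimension and keeps "location".
time_mean = da.mean(dim="time")

# Grouping by calendar month needs a datetime coordinate;
# all three days fall in January, so there is a single group.
monthly = da.groupby("time.month").mean()

print(by_label.item(), by_index.item())
print(window.sizes["time"])
print(time_mean.values)
print(monthly.sizes["month"])
```

Note that `groupby("time.month")` relies on `time` holding datetime values; with plain string labels the month cannot be derived.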

### Quick Decision Tree

```
Working with multidimensional scientific data?
├─ YES → Use Xarray for labeled dimensions
└─ NO → NumPy/Pandas sufficient

Need to track coordinates and metadata
```
