python-polars

verified

This skill should be used when the user asks to "work with polars", "create a dataframe", "use lazy evaluation", "migrate from pandas", "optimize data pipelines", "read parquet files", "group by operations", or needs guidance on Polars DataFrame operations, expression API, performance optimization, or data transformation workflows.

Marketplace

oaps

tbhb/oaps

Plugin

oaps

development

Repository

tbhb/oaps
1 star

skills/python-polars/SKILL.md

Last Verified

January 20, 2026

Install Skill

npx add-skill https://github.com/tbhb/oaps/blob/main/skills/python-polars/SKILL.md -a claude-code --skill python-polars

Installation paths:

Claude
.claude/skills/python-polars/

Instructions

# Python Polars

Polars is a DataFrame library for Python, written in Rust and built on the Apache Arrow columnar memory format. It provides an expression-based API, a lazy evaluation framework, and automatic parallelization for high-performance data processing.

## Quick start

### Installation

```bash
uv pip install polars
```

### Basic operations

```python
import polars as pl

# Create DataFrame
df = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["NY", "LA", "SF"]
})

# Select columns
df.select("name", "age")

# Filter rows
df.filter(pl.col("age") > 25)

# Add computed columns
df.with_columns(
    age_plus_10=pl.col("age") + 10
)
```

## Core concepts

### Expressions

Expressions are composable units describing data transformations. Use `pl.col("column_name")` to reference columns and chain methods for complex operations:

```python
df.select(
    pl.col("name"),
    (pl.col("age") * 12).alias("age_in_months")
)
```

Expressions execute within contexts: `select()`, `with_columns()`, `filter()`, `group_by().agg()`.

### Lazy vs eager evaluation

**Eager (DataFrame)**: Operations execute immediately.

```python
df = pl.read_csv("file.csv")  # Reads immediately
result = df.filter(pl.col("age") > 25)
```

**Lazy (LazyFrame)**: Operations build an optimized query plan.

```python
lf = pl.scan_csv("file.csv")  # Doesn't read yet
result = lf.filter(pl.col("age") > 25).select("name", "age")
df = result.collect()  # Executes optimized query
```

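Prefer lazy mode for large datasets, multi-step pipelines, and performance-critical work. Because the whole query is known before it runs, the optimizer can apply predicate pushdown (filtering rows as early as possible) and projection pushdown (reading only the columns the query needs), and the resulting plan executes in parallel.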

For detailed concepts including data types, type casting, null handling, and parallelization, see `references/core-concepts.md`.

## Common operations

### Select and with_columns

```python
# Select specific columns
df.select("name", "age")

# Select with expressions
df.select(
    pl.col("name"),
    (pl.col("age")
