This skill should be used when the user asks to "work with polars", "create a dataframe", "use lazy evaluation", "migrate from pandas", "optimize data pipelines", "read parquet files", "group by operations", or needs guidance on Polars DataFrame operations, expression API, performance optimization, or data transformation workflows.
# Python Polars
Polars is a high-performance DataFrame library for Python, written in Rust and built on the Apache Arrow columnar memory format. It provides an expression-based API, a lazy evaluation framework, and automatic parallelization for data processing.
## Quick start
### Installation
```bash
uv pip install polars
```
### Basic operations
```python
import polars as pl
# Create DataFrame
df = pl.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "city": ["NY", "LA", "SF"],
})
# Select columns
df.select("name", "age")
# Filter rows
df.filter(pl.col("age") > 25)
# Add computed columns
df.with_columns(
    age_plus_10=pl.col("age") + 10
)
```
## Core concepts
### Expressions
Expressions are composable units describing data transformations. Use `pl.col("column_name")` to reference columns and chain methods for complex operations:
```python
df.select(
    pl.col("name"),
    (pl.col("age") * 12).alias("age_in_months"),
)
```
Expressions execute within contexts: `select()`, `with_columns()`, `filter()`, `group_by().agg()`.
### Lazy vs eager evaluation
**Eager (DataFrame)**: Operations execute immediately.
```python
df = pl.read_csv("file.csv") # Reads immediately
result = df.filter(pl.col("age") > 25)
```
**Lazy (LazyFrame)**: Operations build an optimized query plan.
```python
lf = pl.scan_csv("file.csv") # Doesn't read yet
result = lf.filter(pl.col("age") > 25).select("name", "age")
df = result.collect() # Executes optimized query
```
Use lazy mode for large datasets, complex pipelines, and when performance is critical. Benefits include automatic query optimization, predicate pushdown, projection pushdown, and parallel execution.
For detailed concepts including data types, type casting, null handling, and parallelization, see `references/core-concepts.md`.
## Common operations
### Select and with_columns
```python
# Select specific columns
df.select("name", "age")
# Select with expressions
df.select(
pl.col("name"),
(pl.col("age")