pdf-split

# PDF Chapter Splitting

Split PDF documents into individual chapter files based on table of contents or text pattern detection.

## Overview

Use this skill when a book or document needs to be divided by chapters and either:
- The PDF has embedded bookmarks/outlines, or
- Chapter boundaries can be detected from text patterns (e.g., "Chapter 1:", "Part One")

## Prerequisites

The scripts need pypdf. Declare it as a uv inline script dependency at the top of the script so that `uv run` resolves it automatically:
```python
# /// script
# dependencies = ["pypdf"]
# ///
```

## Workflow

### Phase 1: Analyze PDF Structure

Run `scripts/extract_toc.py` to analyze the PDF:

```bash
uv run ~/.claude/skills/pdf-split/scripts/extract_toc.py <pdf_path>
```

Output includes:
- Total page count
- Embedded bookmarks/outline (if present)
- Detected chapter patterns from text
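
The source of `extract_toc.py` is not shown here, so the following is only a minimal sketch of how the page count and bookmarks could be read with pypdf. The names `flatten_outline` and `summarize` are hypothetical, and the sketch assumes bookmarks should be reported with 1-based page numbers.

```python
def flatten_outline(items, depth=0):
    """Yield (depth, entry) pairs from a nested outline list.

    pypdf represents child bookmarks as a nested list that follows
    their parent entry, so each nested list is one level deeper.
    """
    for item in items:
        if isinstance(item, list):
            yield from flatten_outline(item, depth + 1)
        else:
            yield depth, item


def summarize(pdf_path):
    """Print the page count and any embedded bookmarks of a PDF."""
    # Imported here so flatten_outline can be used without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    print(f"Total pages: {len(reader.pages)}")
    for depth, dest in flatten_outline(reader.outline):
        # pypdf page indices are 0-based; None means the target was not found.
        index = reader.get_destination_page_number(dest)
        page = index + 1 if index is not None else "?"
        print(f"{'  ' * depth}{dest.title} (page {page})")
```

Calling `summarize("book.pdf")` on a PDF without bookmarks prints only the page count, which is the signal to fall back to text pattern detection.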

### Phase 2: Define Chapter Boundaries

Based on Phase 1 output, define chapter boundaries as a list of tuples:
```python
chapters = [
    (start_page, end_page, "chapter_name"),
    # ...
]
```

**If bookmarks exist**: Use bookmark page numbers directly.
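
Bookmarks give only the page where each chapter starts, so the end pages have to be derived. A minimal sketch, assuming 1-based inclusive page numbers (as the Phase 3 example suggests) and a hypothetical helper name `boundaries_from_starts`:

```python
def boundaries_from_starts(starts, total_pages):
    """Turn sorted (start_page, name) pairs into (start, end, name) tuples."""
    chapters = []
    for i, (start, name) in enumerate(starts):
        # A chapter ends on the page before the next one begins;
        # the last chapter runs to the end of the document.
        end = starts[i + 1][0] - 1 if i + 1 < len(starts) else total_pages
        chapters.append((start, end, name))
    return chapters


chapters = boundaries_from_starts(
    [(1, "00_Intro"), (23, "01_Chapter1")], total_pages=45
)
print(chapters)
```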

**If no bookmarks**:
1. Search for chapter heading patterns in text
2. Verify boundaries by checking page content
3. Present proposed boundaries for user confirmation
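
Before presenting boundaries for confirmation, a quick consistency check can catch overlaps and gaps. This is a sketch only: `check_boundaries` is a hypothetical helper, and it assumes 1-based inclusive page numbers with chapters listed in page order.

```python
def check_boundaries(chapters, total_pages):
    """Return a list of problems; an empty list means the ranges look consistent."""
    problems = []
    previous_end = 0
    for start, end, name in chapters:
        if start > end:
            problems.append(f"{name}: start {start} is after end {end}")
        if start < 1 or end > total_pages:
            problems.append(f"{name}: pages {start}-{end} fall outside 1-{total_pages}")
        if start <= previous_end:
            problems.append(f"{name}: overlaps the previous chapter")
        elif start > previous_end + 1:
            problems.append(
                f"{name}: pages {previous_end + 1}-{start - 1} are not covered"
            )
        previous_end = max(previous_end, end)
    return problems


proposed = [(1, 22, "00_Intro"), (23, 45, "01_Chapter1")]
for start, end, name in proposed:
    print(f"{name}: pages {start}-{end}")
print(check_boundaries(proposed, total_pages=45) or "No problems found")
```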

### Phase 3: Execute Split

Run `scripts/split_by_chapters.py` with the chapter definitions:

```bash
uv run ~/.claude/skills/pdf-split/scripts/split_by_chapters.py <pdf_path> <output_dir> --chapters '<json_chapters>'
```

Example:
```bash
uv run ~/.claude/skills/pdf-split/scripts/split_by_chapters.py \
  ~/book.pdf \
  ~/book_chapters \
  --chapters '[[1,22,"00_Intro"],[23,45,"01_Chapter1"]]'
```
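
Writing the JSON argument by hand is error-prone for long books. A small sketch of generating it from the Phase 2 tuples with the standard library (the variable names are illustrative):

```python
import json
import shlex

chapters = [(1, 22, "00_Intro"), (23, 45, "01_Chapter1")]

# json.dumps serializes tuples as JSON arrays; compact separators
# reproduce the format used in the example above.
chapters_json = json.dumps(chapters, separators=(",", ":"))
print(chapters_json)

# shlex.quote wraps the value in single quotes so it survives the shell intact.
print("--chapters", shlex.quote(chapters_json))
```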

## Common Chapter Patterns

| Pattern | Regex | Example |
|---------|-------|---------|
| Numbered | `Chapter\s+\d+` | "Chapter 1", "Chapter 12" |
| Part + Chapter | `Part\s+\w+.*Chapter` | "Part One: Chapter 1" |
| Section | `Section\s+\d+` | "Section 1.1" |
| Roman numerals | `Chapter\s+[IVXLC]+` | "Chapter IV" |
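
As an illustration of how these patterns might be applied, the sketch below checks the first non-empty line of each page's text against the regexes from the table. The function name `find_headings` and the "first line only" rule are assumptions, not the behavior of the skill's scripts. Page text is passed in as plain strings, so the sketch runs without a PDF.

```python
import re

# Regexes copied from the table above.
PATTERNS = {
    "numbered": r"Chapter\s+\d+",
    "part_chapter": r"Part\s+\w+.*Chapter",
    "section": r"Section\s+\d+",
    "roman": r"Chapter\s+[IVXLC]+",
}


def find_headings(page_texts):
    """Return (page_number, pattern_name, matched_text) for pages that open with a heading."""
    hits = []
    for page_number, text in enumerate(page_texts, start=1):
        lines = [line.strip() for line in text.splitlines() if line.strip()]
        if not lines:
            continue  # blank page
        for name, pattern in PATTERNS.items():
            match = re.match(pattern, lines[0])
            if match:
                hits.append((page_number, name, match.group(0)))
                break  # report one pattern per page
    return hits


pages = ["Chapter 1\nIt began...", "more text", "Chapter IV\n...", "Section 1.1 Scope"]
print(find_headings(pages))
```

Note that `Chapter\s+[IVXLC]+` also matches headings such as "Chapter Introduction" (the leading "I"), so detected headings still need the manual verification described in Phase 2.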

## Edge Cases

### Large Chapter Detection (100+ pages)
When