PDF manipulation toolkit. Extract text/tables, create PDFs, merge/split, fill forms, for programmatic document processing and analysis.
January 24, 2026
npx add-skill https://github.com/K-Dense-AI/claude-scientific-skills/blob/cd537c1af6731965817eed2ae32b8dd8ea9d0b5e/scientific-skills/document-skills/pdf/SKILL.md -a claude-code --skill pdf

Installation paths:
.claude/skills/pdf/

# PDF Processing Guide
## Overview
Extract text and tables, create PDFs, merge or split files, and fill forms using Python libraries and command-line tools. Apply this skill for programmatic document processing and analysis. For advanced features, consult reference.md; for form filling, consult forms.md.
## Visual Enhancement with Scientific Schematics
**When creating documents with this skill, always consider adding scientific diagrams and schematics to enhance visual communication.**
If your document does not already contain schematics or diagrams:
- Use the **scientific-schematics** skill to generate AI-powered publication-quality diagrams
- Simply describe your desired diagram in natural language
- Nano Banana Pro will automatically generate, review, and refine the schematic
**For new documents:** Scientific schematics should be generated by default to visually represent key concepts, workflows, architectures, or relationships described in the text.
**How to generate schematics:**
```bash
python scripts/generate_schematic.py "your diagram description" -o figures/output.png
```
The AI will automatically:
- Create publication-quality images with proper formatting
- Review and refine through multiple iterations
- Ensure accessibility (colorblind-friendly, high contrast)
- Save outputs in the figures/ directory
**When to add schematics:**
- PDF processing workflow diagrams
- Document manipulation flowcharts
- Form processing visualizations
- Data extraction pipeline diagrams
- Any complex concept that benefits from visualization
For detailed guidance on creating schematics, refer to the scientific-schematics skill documentation.
---
## Quick Start
```python
from pypdf import PdfReader, PdfWriter
# Read a PDF
reader = PdfReader("document.pdf")
print(f"Pages: {len(reader.pages)}")
# Extract text
text = ""
for page in reader.pages:
    text += page.extract_text()
```
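The Quick Start loop can be wrapped in a small helper that tolerates pages with no extractable text (for example, scanned images). The following is a minimal sketch, not part of the original guide: the names `TextPage`, `join_page_text`, and `read_pdf_text` are hypothetical, and the `pypdf` import is deferred so the joining logic can be exercised without the library installed.

```python
from typing import Iterable, Optional, Protocol


class TextPage(Protocol):
    """Anything exposing extract_text(), such as a pypdf page object."""

    def extract_text(self) -> Optional[str]: ...


def join_page_text(pages: Iterable[TextPage], separator: str = "\n") -> str:
    """Concatenate the text of each page, treating missing text as empty."""
    parts = []
    for page in pages:
        # Pages without a text layer usually yield an empty string;
        # `or ""` also guards against a None result.
        parts.append(page.extract_text() or "")
    return separator.join(parts)


def read_pdf_text(path: str) -> str:
    """Open a PDF with pypdf and return the text of all its pages."""
    # Assumption: pypdf is installed. Imported here so that
    # join_page_text stays usable without it.
    from pypdf import PdfReader

    reader = PdfReader(path)
    return join_page_text(reader.pages)
```

Calling `read_pdf_text("document.pdf")` would return the same text as the loop above, with one separator between pages. Joining a list once at the end also avoids repeated string concatenation on large documents.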
## Python Libraries
### pypdf - Basic Operations
#### Merge PDFs
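As a rough illustration of merging, the sketch below appends each input file to a single `PdfWriter` and writes the result. It assumes a pypdf version in which `PdfWriter.append` accepts a file path. The function name `merge_pdfs` and its optional `writer` parameter are hypothetical, added here only so the logic can be checked with a stand-in object.

```python
from typing import Any, Iterable, Optional


def merge_pdfs(
    input_paths: Iterable[str],
    output_path: str,
    writer: Optional[Any] = None,
) -> None:
    """Append every input PDF, in order, and write one combined file."""
    if writer is None:
        # Assumption: pypdf is installed. Imported lazily so a stand-in
        # writer can be passed in without the library present.
        from pypdf import PdfWriter

        writer = PdfWriter()

    for path in input_paths:
        # append() adds all pages of the given file to the writer.
        writer.append(path)

    # Write the combined document to disk.
    writer.write(output_path)
```

A typical call might look like `merge_pdfs(["doc1.pdf", "doc2.pdf"], "merged.pdf")`, where the file names are placeholders. Pages appear in the output in the order the paths are given.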
```python
from pypdf import PdfWriter
```