ocrmypdf-optimize

OCRmyPDF optimization skill — compress PDFs, configure PDF/A output, JBIG2 encoding, and lossless optimization. Use when the user needs to reduce PDF file size, create archival PDF/A files, or optimize OCR output.

Install this agent skill to your Project

npx add-skill https://github.com/partme-ai/full-stack-skills/tree/main/skills/ocrmypdf-skills/ocrmypdf-optimize

SKILL.md

OCRmyPDF — Optimization Guide

Overview

OCRmyPDF provides extensive optimization options to reduce file size, create PDF/A archival documents, and configure output quality.

For core OCR functionality, see the ocrmypdf skill. For image processing (deskew, rotate, clean), see ocrmypdf-image. For batch/Docker/scripting, see ocrmypdf-batch.

Compression Levels

bash

# Level 0: no optimization (fastest)
ocrmypdf --optimize 0 input.pdf output.pdf

# Level 1: lossless optimizations only (default)
ocrmypdf --optimize 1 input.pdf output.pdf

# Level 2: lossy optimizations (JPEG and PNG recompression)
ocrmypdf --optimize 2 input.pdf output.pdf

# Level 3: lossy, more aggressive than level 2 (targets lower image quality)
ocrmypdf --optimize 3 input.pdf output.pdf
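
To see what each level actually buys on a given document, the output sizes can be compared directly. The following is a minimal sketch, not part of OCRmyPDF: the helper names (`size_of`, `percent_saved`, `compare_levels`) and the output file names are hypothetical, and `compare_levels` assumes `ocrmypdf` is installed.

```shell
# size_of FILE: print the file size in bytes.
size_of() {
    wc -c < "$1" | tr -d ' '
}

# percent_saved ORIGINAL OPTIMIZED: print the whole-number percentage by
# which OPTIMIZED is smaller than ORIGINAL (negative if it grew).
percent_saved() {
    before=$(size_of "$1")
    after=$(size_of "$2")
    [ "$before" -gt 0 ] || return 1
    echo $(( (before - after) * 100 / before ))
}

# compare_levels INPUT: run every optimization level and report the savings.
# Assumes ocrmypdf is on PATH; the out-O<level>.pdf names are illustrative.
compare_levels() {
    for level in 0 1 2 3; do
        out="out-O$level.pdf"
        ocrmypdf --optimize "$level" "$1" "$out" || continue
        echo "level $level: $(percent_saved "$1" "$out")% smaller"
    done
}
```

Calling `compare_levels input.pdf` prints one line per level. A negative figure at level 0 would not be surprising, since OCR adds a text layer without any compensating optimization.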

PDF/A Output

PDF/A is an archival format with embedded fonts and colorspaces:

bash

# PDF/A-2b (default)
ocrmypdf --output-type pdfa input.pdf output.pdf

# PDF/A-1b (oldest part; no transparency)
ocrmypdf --output-type pdfa-1 input.pdf output.pdf

# PDF/A-2b, requested explicitly (supports transparency)
ocrmypdf --output-type pdfa-2 input.pdf output.pdf

# Standard PDF (no archival conformance)
ocrmypdf --output-type pdf input.pdf output.pdf
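
A quick way to see which PDF/A part an output file declares is to read the `pdfaid` fields from its XMP metadata. This is a rough sketch under two assumptions: the XMP packet is stored uncompressed (usual for PDF/A), and your `grep` supports `-a` and `-o`. The function name `pdfa_claim` is hypothetical. It reports only what the file claims; it is not a conformance check, for which a dedicated validator such as veraPDF is the appropriate tool.

```shell
# pdfa_claim FILE: print the PDF/A part and conformance level declared in
# the file's XMP metadata (for example "2B"), or return 1 if none is found.
# Handles element form (<pdfaid:part>2</pdfaid:part>) and
# attribute form (pdfaid:part="2").
pdfa_claim() {
    part=$(grep -a -o 'pdfaid:part[>="]*[0-9]' "$1" | head -n 1 | tr -cd '0-9')
    conf=$(grep -a -o 'pdfaid:conformance[>="]*[A-Za-z]' "$1" | head -n 1 | sed 's/.*[>="]//')
    [ -n "$part" ] || return 1
    echo "$part$conf"
}
```

For example, `pdfa_claim output.pdf` run on a file produced with `--output-type pdf` would normally find no declaration and return a non-zero status.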

JBIG2 Encoding

JBIG2 provides excellent compression for monochrome (1-bit) images:

bash

# Enable JBIG2 (requires jbig2enc; applies only at --optimize 1 or higher)
ocrmypdf --jbig2-lossy input.pdf output.pdf  # Lossy; may substitute similar-looking glyphs

ocrmypdf --jbig2-lossless input.pdf output.pdf  # Lossless (v17+)

Requirements:

bash

# Debian/Ubuntu
apt install jbig2enc

# macOS
brew install jbig2enc
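
Before relying on JBIG2 or pngquant, it can help to confirm that the optional encoders are actually on `PATH`. This is a small sketch with hypothetical helper names (`have_tool`, `report_optimizers`). It assumes that jbig2enc installs its executable as `jbig2` and that pngquant installs `pngquant`, which may differ on some distributions.

```shell
# have_tool NAME: succeed if NAME is an executable on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# report_optimizers: print whether each optional optimizer is available.
# Assumption: jbig2enc provides "jbig2"; pngquant provides "pngquant".
report_optimizers() {
    for tool in jbig2 pngquant; do
        if have_tool "$tool"; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}
```

Running `report_optimizers` prints one line per tool.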

PNG Optimization

Optimize embedded PNG images:

bash

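# Lossy: --optimize 2 or 3 quantizes PNGs with pngquant; --png-quality (0-100) sets the target
ocrmypdf --optimize 2 --png-quality 50 input.pdf output.pdf

# Lossless: --optimize 1 recompresses images without changing pixel data
ocrmypdf --optimize 1 input.pdf output.pdf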

Ghostscript Options

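Fine-tune how Ghostscript compresses images when it generates PDF/A. These options do not apply to --output-type pdf. Note that --pdf-renderer is a different setting: it selects how the OCR text layer is drawn and accepts auto, hocr, or sandwich.

bash

# Force lossless image compression during PDF/A conversion
ocrmypdf --pdfa-image-compression lossless input.pdf output.pdf

# Recompress grayscale and color images as JPEG during PDF/A conversion
ocrmypdf --pdfa-image-compression jpeg input.pdf output.pdf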

Sidecar Text

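Write the recognized text to a plain-text file alongside, or instead of, the output PDF:

bash

# Generate sidecar only (with --output-type none, pass - as the output file)
ocrmypdf --output-type none --sidecar text.txt input.pdf -

# Sidecar plus OCRed PDF; --force-ocr rasterizes every page and redoes OCR even where text already exists
ocrmypdf --sidecar text.txt --force-ocr input.pdf output.pdf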
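
A sidecar file is also a cheap signal of whether OCR found anything at all. This sketch counts the words in a sidecar file; the helper names (`sidecar_words`, `sidecar_is_empty`) are hypothetical, and treating an empty sidecar as a failed OCR run is an assumption rather than OCRmyPDF behaviour.

```shell
# sidecar_words FILE: print the number of whitespace-separated words in FILE.
sidecar_words() {
    wc -w < "$1" | tr -d ' '
}

# sidecar_is_empty FILE: succeed if the sidecar contains no words at all.
sidecar_is_empty() {
    [ "$(sidecar_words "$1")" -eq 0 ]
}
```

In a script this could be used as `sidecar_is_empty text.txt && echo "no text recognized" >&2`.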

Combined Recipes

Maximum compression

bash

# --optimize 3 already enables lossy PNG (pngquant) and JPEG recompression
ocrmypdf --optimize 3 --jbig2-lossy input.pdf small.pdf

Archival PDF/A with compression

bash

ocrmypdf --output-type pdfa --optimize 2 input.pdf archival.pdf

Lossless output

bash

ocrmypdf --output-type pdf --optimize 1 input.pdf lossless.pdf
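
The three recipes can be wrapped in a small helper so that scripts pick one by name. This is a sketch only: the recipe labels (`max`, `archival`, `lossless`), the function names, and the `DRY_RUN` variable are hypothetical. It uses only the `--optimize`, `--output-type` and `--jbig2-lossy` flags, so the `max` recipe leaves PNG and JPEG recompression to `--optimize 3`.

```shell
# recipe_flags NAME: print the ocrmypdf flags for a named recipe.
# The recipe names are hypothetical labels, not ocrmypdf options.
recipe_flags() {
    case "$1" in
        max)      echo "--optimize 3 --jbig2-lossy" ;;
        archival) echo "--output-type pdfa --optimize 2" ;;
        lossless) echo "--output-type pdf --optimize 1" ;;
        *)        echo "unknown recipe: $1" >&2; return 1 ;;
    esac
}

# run_recipe NAME INPUT OUTPUT: run ocrmypdf with the flags for NAME.
# Set DRY_RUN=1 to print the command instead of running it.
run_recipe() {
    flags=$(recipe_flags "$1") || return 1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ocrmypdf $flags $2 $3"
    else
        # $flags is left unquoted on purpose so it splits into separate arguments.
        ocrmypdf $flags "$2" "$3"
    fi
}
```

With `DRY_RUN=1`, `run_recipe archival input.pdf archival.pdf` prints the command it would run, which is a convenient check before processing a large batch.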

Quick Reference

Task	Command
No optimization	`--optimize 0`
Lossless (default)	`--optimize 1`
Lossy	`--optimize 2`
Most aggressive lossy	`--optimize 3`
PDF/A-2b (default)	`--output-type pdfa`
PDF/A-1b	`--output-type pdfa-1`
JBIG2 lossy	`--jbig2-lossy`
PNG quality (lossy, with `--optimize 2` or `3`)	`--png-quality N`
Sidecar text	`--sidecar text.txt`

Troubleshooting

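Large file size: Try --optimize 2 or --optimize 3; both enable lossy JPEG and PNG recompression (PNG recompression requires pngquant).
PDF/A validation fails: --output-type pdfa already targets PDF/A-2b. Try --output-type pdfa-1 or pdfa-3 if the consumer expects a different part, or --output-type pdf if archival conformance is not required.
Font issues: OCRmyPDF has no PDF/A-2u output type. Its PDF/A modes target the "b" (basic) conformance level, which does not guarantee a Unicode mapping for text already in the file.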

Maintainer

partme-ai Core maintainer

Source details

Full Name: partme-ai/full-stack-skills
Branch: main
Path in repo: skills/ocrmypdf-skills/ocrmypdf-optimize
License: Other
Topics: claude-code agent-skills cursor skills codebuddy qoder
