Agent skill

read-bin-docs

Straightforward text extraction from document files (text-based PDFs only for now; no OCR or DOCX support). Use it when you just need to read or extract text from binary documents.


Install this agent skill to your Project

npx add-skill https://github.com/YPares/agent-skills/tree/main/read-bin-docs

SKILL.md

Doc Formats

Quick Start: Extract Text from PDF

Need to extract text from a PDF? Use this Python snippet:

python
from pypdf import PdfReader

reader = PdfReader("document.pdf")
# Join pages with a newline so text from adjacent pages does not run together
text = "\n".join(page.extract_text() for page in reader.pages)
print(text)

Or from the command line:

bash
uvx --with pypdf python /path/to/extract_pdf_text.py document.pdf

PDF Text Extraction

Basic Usage

python
from pypdf import PdfReader

# Read all pages
reader = PdfReader("file.pdf")
for page in reader.pages:
    text = page.extract_text()
    print(text)

Extract Specific Pages

python
from pypdf import PdfReader

reader = PdfReader("file.pdf")
# Get pages 1-5 (reader.pages is 0-indexed, so the slice is 0:5)
for page in reader.pages[0:5]:
    print(page.extract_text())

Using the Script

This skill includes scripts/extract_pdf_text.py for command-line extraction:

bash
# Extract all pages to stdout
python extract_pdf_text.py document.pdf

# Extract to file
python extract_pdf_text.py document.pdf --output text.txt

# Extract specific pages
python extract_pdf_text.py document.pdf --pages 1-5
python extract_pdf_text.py document.pdf --pages 1,3,5
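
The script's own argument handling is not shown here. As a rough sketch of how a --pages value such as 1-5 or 1,3,5 could be mapped onto pypdf's 0-indexed pages, assuming the numbers are 1-based and ranges are inclusive (parse_page_spec is a hypothetical helper, not part of the skill):

```python
def parse_page_spec(spec: str, page_count: int) -> list[int]:
    """Turn a 1-based spec such as "1-5" or "1,3,5" into 0-based page indices.

    Hypothetical helper: the 1-based, inclusive convention is an assumption.
    """
    indices: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end < start or end > page_count:
            raise ValueError(
                f"invalid page range {part!r} for a {page_count}-page document"
            )
        # Convert the inclusive 1-based range to 0-based indices.
        indices.extend(range(start - 1, end))
    return indices


print(parse_page_spec("1-5", page_count=10))    # [0, 1, 2, 3, 4]
print(parse_page_spec("1,3,5", page_count=10))  # [0, 2, 4]
```

The resulting indices can then be used as reader.pages[i] for each i.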

Requirements

  • pypdf, which can be supplied on the fly: uvx --with pypdf python <script>
  • Works with most text-based PDFs
  • Scanned (image-only) PDFs produce no text, because no OCR is performed

Common Issues

"No text extracted": The PDF may be scanned (image-based) without OCR. OCR support requires additional tools.
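
One way to spot this case is to check which pages come back blank. The following is a minimal sketch, not part of the skill's script; find_empty_pages and report_empty_pages are hypothetical helper names, and a blank page is only a hint that the page may be a scanned image:

```python
def find_empty_pages(page_texts: list[str]) -> list[int]:
    """Return the 1-based numbers of pages whose extracted text is blank."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if not (text or "").strip()
    ]


def report_empty_pages(pdf_path: str) -> list[int]:
    """Extract every page with pypdf and list the pages that produced no text."""
    # Imported lazily so find_empty_pages can be used without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return find_empty_pages([page.extract_text() for page in reader.pages])


print(find_empty_pages(["Chapter 1", "", "  \n", "Appendix"]))  # [2, 3]
```

If every page is reported as empty, the document is probably image-based and needs OCR before any text can be read.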

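"Encoding errors": pypdf handles most encodings, but some PDFs embed fonts whose characters do not map cleanly to text. If the output is garbled or badly spaced, try layout-aware extraction with page.extract_text(extraction_mode="layout"), which is available in recent pypdf releases.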
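
In recent pypdf releases, layout-aware extraction is selected with the extraction_mode="layout" keyword. The sketch below tries it first and falls back to plain extraction when the installed version rejects the keyword; extract_with_layout is a hypothetical helper, and the fallback relies on older versions raising TypeError for the unknown argument:

```python
def extract_with_layout(page) -> str:
    """Prefer layout-aware extraction, falling back to plain mode if unsupported."""
    try:
        return page.extract_text(extraction_mode="layout")
    except TypeError:
        # Assumption: versions without layout mode reject the unknown keyword.
        return page.extract_text()


# Usage with pypdf (requires a real PDF file):
#   from pypdf import PdfReader
#   reader = PdfReader("file.pdf")
#   print(extract_with_layout(reader.pages[0]))
```

Layout mode mainly changes spacing and line placement, so it may make the text more readable without fixing a genuinely broken font encoding.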


Future: Support for DOCX, XLSX, and other formats coming soon.
