Agent skill

gpu-document-processing

Use when processing large PDFs, document collections, or bulk text extraction tasks that benefit from GPU-accelerated processing. Triggers when the user provides large documents or needs bulk document analysis.

Install this agent skill to your Project

npx add-skill https://github.com/langchain-ai/deepagents/tree/main/examples/nvidia_deep_agent/skills/gpu-document-processing

SKILL.md

GPU Document Processing Skill

Process large documents and document collections using GPU-accelerated tools. This skill uses the sandbox-as-tool pattern: the agent reasons on CPU and sends document-processing work to a GPU-equipped sandbox.

When to Use This Skill

Use this skill when:

Processing large PDF files (50+ pages)
Analyzing collections of documents (10+ files)
Extracting structured data from unstructured documents
Performing bulk text extraction and chunking
Generating embeddings for large document sets
The user uploads or references large documents for analysis

Architecture: Sandbox as Tool

The sandbox-as-tool pattern divides GPU execution into three stages:

Agent reasons on CPU - planning, synthesis, report writing
Processing sent to GPU sandbox - document parsing, embedding, extraction
Results returned to agent - structured output for further analysis

This separation ensures:

API keys stay outside the sandbox (security)
Agent state persists independently of processing jobs
Processing can be parallelized across documents
Cost-efficient: GPU used only during processing, not during reasoning
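
A minimal sketch of this separation, under stated assumptions: `Job`, `GpuSandbox`, `LocalFakeSandbox`, and `process_document` are hypothetical names chosen for illustration and are not part of the deepagents API. The fake sandbox runs in-process so the example works without a GPU; a real sandbox would execute the job remotely.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Job:
    """A unit of work handed to the sandbox. It carries no API keys."""
    document_path: str
    task: str  # e.g. "extract_text" or "embed" (illustrative values)


class GpuSandbox(Protocol):
    """Hypothetical interface: anything that can run a Job and return a dict."""

    def run(self, job: Job) -> dict: ...


class LocalFakeSandbox:
    """Stand-in that runs in-process so the sketch is runnable anywhere."""

    def run(self, job: Job) -> dict:
        return {"document": job.document_path, "task": job.task, "status": "ok"}


def process_document(sandbox: GpuSandbox, path: str, task: str) -> dict:
    """Tool the agent calls: build a job, send it out, return structured output."""
    return sandbox.run(Job(document_path=path, task=task))


result = process_document(LocalFakeSandbox(), "report.pdf", "extract_text")
print(result)
```

Because the agent only ever sees the returned dictionary, its own state and credentials never enter the sandbox.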

Capabilities

PDF Text Extraction

Extract text content from PDF documents with layout preservation:

Headers, paragraphs, lists, and tables detected separately
Page numbers and section boundaries preserved
Multi-column layout handling

Tabular Data Extraction

Extract tables from documents into structured formats:

PDF tables to CSV/DataFrames using GPU-accelerated parsing
Automatic column type detection
Handles merged cells and multi-row headers
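
To illustrate what column type detection involves, here is a small CPU-only heuristic over string cells. It is an assumption-laden sketch, not the GPU-accelerated parser this skill refers to; `infer_type` is a hypothetical helper name.

```python
def infer_type(values):
    """Guess a column type from its string cells: int, float, str, or empty."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "empty"
    # Try the narrowest type first; fall through on the first cell that fails.
    for name, cast in (("int", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "str"


# Illustrative data only: blank cells are ignored when inferring the type.
columns = {
    "id": ["1", "2", "3"],
    "price": ["1.5", "2", ""],
    "name": ["a", "b", "c"],
}
types = {name: infer_type(cells) for name, cells in columns.items()}
print(types)
```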

Document Chunking

Split large documents into meaningful chunks for analysis:

Semantic chunking (by topic/section boundaries)
Fixed-size chunking with overlap for embedding
Configurable chunk sizes (default: 512 tokens)
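
Fixed-size chunking with overlap can be sketched as follows. The 512-token default comes from the text above; the overlap of 64 and the use of whitespace-split words as stand-ins for real tokens are assumptions made for illustration.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=64):
    """Split a token list into windows of chunk_size that share overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window reaches the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Whitespace splitting stands in for a real tokenizer here.
tokens = "a b c d e f g h i j".split()
chunks = chunk_tokens(tokens, chunk_size=4, overlap=1)
for chunk in chunks:
    print(chunk)
```

Each chunk repeats the last token of the previous one, so text that straddles a boundary still appears intact in at least one chunk.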

Embedding Generation

Generate vector embeddings for document chunks:

Uses NVIDIA NeMo Retriever NIM for GPU-accelerated embedding
Supports batch processing for large document sets
Compatible with standard vector stores (Milvus, ChromaDB)
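
Batch processing for embeddings might look like the sketch below. `fake_embed_batch` is a placeholder that returns toy two-dimensional vectors; in practice it would be replaced by a call to an embedding service such as NeMo Retriever, whose client API is not shown here. The batch size is an assumed parameter.

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


def embed_all(chunks, embed_batch, batch_size=32):
    """Embed chunks batch by batch, preserving input order."""
    vectors = []
    for batch in batched(chunks, batch_size):
        vectors.extend(embed_batch(list(batch)))
    return vectors


def fake_embed_batch(texts):
    """Placeholder embedder: [character count, word count] per text."""
    return [[float(len(t)), float(len(t.split()))] for t in texts]


vectors = embed_all(
    ["alpha beta", "gamma", "delta epsilon zeta"],
    fake_embed_batch,
    batch_size=2,
)
print(vectors)
```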

Workflow

Receive document reference from the orchestrator
Determine processing type (extraction, analysis, embedding)
Send to GPU sandbox for processing
Collect structured results (text, tables, embeddings)
Write findings to /shared/ for the orchestrator to synthesize
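
The steps above can be sketched as a single function. The names `run_workflow` and `fake_process`, the JSON layout, and the output filename scheme are assumptions for illustration. The example writes to a temporary directory so it runs anywhere; in the skill's setting the target would be `/shared/`.

```python
import json
import tempfile
from pathlib import Path

PROCESSING_TYPES = {"extraction", "analysis", "embedding"}


def run_workflow(document_ref, processing_type, process, shared_dir):
    """Dispatch one document for processing and persist the structured result."""
    if processing_type not in PROCESSING_TYPES:
        raise ValueError(f"unknown processing type: {processing_type}")
    # 'process' stands in for the call that sends work to the GPU sandbox.
    result = process(document_ref, processing_type)
    out_dir = Path(shared_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{Path(document_ref).stem}.{processing_type}.json"
    payload = {"document": document_ref, "type": processing_type, "result": result}
    out_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return out_path


def fake_process(ref, kind):
    """Placeholder for sandbox processing."""
    return {"text": f"{kind} output for {ref}"}


with tempfile.TemporaryDirectory() as tmp:
    path = run_workflow("report.pdf", "extraction", fake_process, tmp)
    saved = json.loads(path.read_text(encoding="utf-8"))

print(path.name, saved["result"]["text"])
```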

Processing Large Document Collections

For multiple documents:

Process documents in parallel batches (3-5 concurrent)
Extract key metadata first (title, date, author, page count)
Generate per-document summaries
Cross-reference findings across documents
Write consolidated findings with per-document citations
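
A bounded-concurrency sketch for the batch step, using Python's standard `ThreadPoolExecutor`. The limit of 4 sits inside the 3-5 range given above; `process_collection` and `fake_summary` are hypothetical names, and the per-document function stands in for real sandbox calls.

```python
from concurrent.futures import ThreadPoolExecutor


def process_collection(paths, process_one, max_concurrent=4):
    """Process documents with at most max_concurrent jobs in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # pool.map returns results in input order, whatever the finish order.
        results = list(pool.map(process_one, paths))
    # Key findings by path so each one can be cited per document later.
    return dict(zip(paths, results))


def fake_summary(path):
    """Placeholder for metadata extraction and summarization."""
    return {"title": path.rsplit(".", 1)[0], "summary": f"summary of {path}"}


findings = process_collection(["a.pdf", "b.pdf", "c.pdf"], fake_summary)
for doc, info in findings.items():
    print(doc, info["summary"])
```

Threads suit this case because each job mostly waits on a remote sandbox rather than computing locally.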

Output Format

When reporting document processing results:

Include document metadata (filename, pages, size)
Structure extracted content by section/chapter
Format tables as markdown tables
Include page references for all extracted content
Note any extraction quality issues (scanned images, corrupted pages)
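
As one way to meet the table and page-reference guidelines, the sketch below renders extracted rows as a markdown table followed by a source line. `to_markdown_table` and the `(source: page N)` wording are illustrative assumptions, and the sample values are made up.

```python
def to_markdown_table(header, rows, page=None):
    """Render a header and rows as a markdown table, with an optional page note."""

    def esc(cell):
        # Escape pipes so cell content cannot break the table layout.
        return str(cell).replace("|", "\\|")

    lines = [
        "| " + " | ".join(esc(h) for h in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(esc(c) for c in row) + " |")
    if page is not None:
        lines += ["", f"(source: page {page})"]
    return "\n".join(lines)


table = to_markdown_table(
    ["Quarter", "Revenue"],
    [["Q1", 120], ["Q2", 135]],
    page=7,
)
print(table)
```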

Integration with NVIDIA NIM

For production deployments, GPU document processing can leverage:

NVIDIA NeMo Retriever: GPU-accelerated embedding and retrieval
NVIDIA RAPIDS cuDF: Tabular data processing from extracted tables
NVIDIA Triton: Scalable inference for document classification models

See NVIDIA's NIM documentation for self-hosted deployment options.

Maintainer

langchain-ai Core maintainer

Source details

Full Name: langchain-ai/deepagents
Branch: main
Path in repo: examples/nvidia_deep_agent/skills/gpu-document-processing
License: MIT License
Topics: ai langgraph deepagents langchain
