Agent skill
consume-content
Produce a faithful content-snapshot of any source material (article, report, PDF, advisory) with verbatim quotes, structural transparency, and labeled editorial.
Install this agent skill to your Project
npx add-skill https://github.com/juanandresgs/claude-ctrl/tree/main/skills/consume-content
SKILL.md
Content Snapshot Skill
Produce a structured, verifiable content-snapshot of source material — articles, reports, PDFs, advisories, papers, transcripts. The output follows the Content Snapshot v0.2 template (see template.md in this directory) with verbatim quotes, structural transparency, and labeled editorial.
Why this exists: Narrated summaries of source material risk hallucinations and misrepresentation. Content-snapshots solve this by making every claim traceable to a verbatim quote with a page/section reference. The analyst's interpretation is always labeled and separated.
Known Limitation: The Verbatim Problem
Claude doesn't copy-paste. It reads text into context, then writes from context. Every "quote" is a reconstruction from the context window, not a mechanical transfer. This means subtle word substitutions, dropped articles, or reordering can occur even with good intent. A content-snapshot that silently mangles quotes is worse than a summary — it lies about its own fidelity.
Mitigation: This skill uses a Read-Write-Verify pipeline (Phases 3 and 5) designed to minimize reconstruction error. Phase 5 verification is MANDATORY, not optional.
What this can't guarantee:
- PDF text extraction isn't perfect — OCR artifacts, ligatures, encoding issues can cause mismatches even when the quote is faithfully reproduced
- Very long quotes (4+ sentences) have higher reconstruction error risk than short ones
- If the user needs guaranteed verbatim fidelity for legal/compliance use, they should verify against the original document
Phase 1: Detect Input & Ingest
Parse $ARGUMENTS to determine input type and ingest the source material.
Input Detection
| Pattern | Type | Tool |
|---|---|---|
Starts with http:// or https:// |
URL | WebFetch |
Ends with .pdf |
PDF file | Read with pages parameter |
| Existing file path | Local file | Read |
| None of the above | Ask user | AskUserQuestion |
Ingestion Rules
URLs:
- Use
WebFetchto retrieve content - If fetch fails (paywall, 403, redirect loop): report failure, ask user for a local copy or pasted content
- Store the fetched content for subsequent passes
PDFs:
- Read with
pagesparameter in 20-page batches:pages: "1-20",pages: "21-40", etc. - Continue until all pages are read
- Note total page count for structure mapping
Local files:
- Read directly with
Readtool - For very large files (>2000 lines), read in segments
Output Setup
Create the output directory:
.claude/snapshots/{slugified-title}_{YYYY-MM-DD}/
Slugify the title: lowercase, replace spaces with hyphens, remove special characters, truncate to 50 chars.
Write a preliminary snapshot.md with a header:
# Content Snapshot: [Title]
**Source:** [URL or file path]
**Author:** [if known]
**Date:** [publication date if known]
**Snapshot Date:** [today's date]
**Template:** Content Snapshot v0.2
---
Phase 2: Structure Discovery (First Pass)
Read through the entire source to build a structural understanding. Do NOT extract quotes yet.
Identify:
- Title, author(s), publication date
- Total length (pages or word count estimate)
- Document structure: chapters, sections, headings
- Whether the source has its own summary/abstract/key findings
- The source's conclusion/recommendations section (for Section 5)
Write to snapshot.md:
Section 2 — Document Structure:
- Chapter/section outline with 1-2 sentence neutral descriptions
- Include page ranges (PDFs) or section identifiers (web content)
If the source has no discernible structure (e.g., a short blog post), note this and adapt: treat the entire piece as a single section.
Phase 3: Content Extraction (Section-at-a-Time)
This is the high-fidelity extraction pass. The key constraint: read one section, write its quotes, then move to the next. Do not accumulate multiple sections in context before writing.
For each section in the Document Structure:
- Re-read just that section from the source (use page ranges for PDFs, section offsets for files)
- Immediately extract 1-3 verbatim quotes — prefer shorter quotes (1-3 sentences) where accuracy is easier to maintain
- Write the quotes to
snapshot.mdbefore reading the next section — this is critical for fidelity - Tag each quote with a relevance label (2-3 words)
- Format each quote as:
#### [Section Title] — [pages/location]
**[Relevance Tag]**
> "[Verbatim quote from source]"
> <cite>[Page X / Section Y]</cite>
Section 1 — Key Findings (Author Summary):
- If the source has its own summary/abstract/executive summary: extract verbatim bullets with page references
- If not: write "No author summary present in source" and skip
Section 4 — Selected Content:
- Add a selection criteria note at the top explaining what was prioritized
- Process each section from the Document Structure in order
Section 5 — Why This Matters (Author Conclusions):
- Extract direct quotes from the source's conclusion/implications/recommendations
- No paraphrase — only verbatim quotes
Coverage Tracking:
As you process each section, track coverage for Section 3:
- Which sections received quotes (covered)
- Which sections were omitted and why
After all sections are processed, write Section 3 — Coverage Map as a table:
| Section | Covered? | Quotes | Notes |
|---------|----------|--------|-------|
| [name] | Yes | 2 | |
| [name] | No | 0 | Background only, no novel findings |
Every section from the Document Structure must appear in this table.
Phase 4: Editorial & Self-Audit
Section 6 — Analyst Assessment:
Write a brief synthesis (max 250 words). Rules:
- Open with: "The following is editorial analysis, not source material."
- Distinguish inference from report content
- No new facts not grounded in the source
- Count your words — stay under 250
Section 7 — Representation Assessment:
Write 4-6 bullets evaluating:
- How representative this snapshot is of the full source
- What perspectives or topics are emphasized/underrepresented
- The source's own perspective, bias, or institutional position
- Any significant content that was omitted and why
Phase 5: Verify ALL Quotes & Deliver
This phase is MANDATORY. Do not skip it.
Quote Verification
For EVERY blockquote in snapshot.md:
- Re-read the cited page/section from the source — use the page/section reference in the
<cite>tag - Search for the quoted text in the re-read content using
Grepif the source is a local file - Compare the quote against the source:
- If found verbatim: mark as verified
- If found with differences: note the specific differences, correct the quote in
snapshot.mdusingEdit - If not found at all: flag as potential fabrication — remove the quote and attempt to re-extract from the source, or remove entirely with a note
Structural Checks
- Every blockquote has a
<cite>with page/section reference - Coverage Map accounts for every section in the Document Structure
- Analyst Assessment is under 250 words (count them)
- No text outside blockquotes presents itself as source material
- Sections appear in template order (1-7)
Verification Log
Append to the end of snapshot.md:
---
## Verification Log
- **Quotes verified:** [N]
- **Corrections made:** [M]
- **Quotes removed (unverifiable):** [K]
- **Coverage map complete:** Yes/No
- **Analyst Assessment word count:** [W]/250
Delivery
- Report the output path to the user:
.claude/snapshots/{slug}_{date}/snapshot.md - If any quote could not be verified and was not removed, do NOT deliver — ask user for guidance
- If all quotes verified (with or without corrections), deliver the snapshot
Enforcement Rules
These constraints make the skill reliable. They are non-negotiable:
- Quotes are verbatim — never paraphrased. If uncertain about exact wording, re-read the source.
- Every quote includes page/section reference in a
<cite>tag. - No invented data. If a fact isn't in the source, it doesn't appear outside the Analyst Assessment.
- Analyst Assessment clearly labeled as editorial — opens with italic disclaimer.
- Remove UI artifacts — file paths, pagination chrome, dashboard headers, navigation elements.
- If source lacks an author summary, note this explicitly. Do NOT fabricate one.
- Coverage Map must account for every section in the Document Structure — nothing silently omitted.
- Representation Assessment must note the perspective/bias of the source itself.
- Phase 5 verification is mandatory — never skip it, never treat it as optional.
- Read-then-write, not accumulate-then-write — extract quotes immediately after reading each section.
Edge Cases
| Scenario | Handling |
|---|---|
| No abstract/summary in source | Skip Section 1, note: "No author summary present" |
| Very short source (<2 pages) | Collapse sections, quote most content directly |
| Very long source (>100 pages) | Batch reads in 20-page chunks, be selective, document omissions thoroughly in Coverage Map |
| URL behind paywall/403 | Report failure, ask user for local copy or paste |
| Multiple sources provided | Ask user: separate snapshots or combined? |
| Non-English source | Quote in original language, note language in header |
| Source is a thread/chat/transcript | Adapt structure to chronological, quote key exchanges |
| PDF with OCR artifacts | Note in Verification Log that text extraction quality may affect quote fidelity |
| Source has no clear sections | Treat as single section, extract quotes by topic clusters |
Write Context Summary (MANDATORY — do this LAST)
Write a compact result summary so the parent session receives key findings:
cat > .claude/.skill-result.md << 'SKILLEOF'
## Content Snapshot Result: [Title]
**Source:** [URL or document path]
**Output:** [path to generated snapshot file]
**Quotes extracted:** [n]
### Key Takeaways
1. [Most important insight]
2. [Second key insight]
3. [Third key insight]
### Coverage
- [What was covered well]
- [Any gaps or sections skipped]
SKILLEOF
Keep under 2000 characters. This is consumed by a hook — the parent session will see it automatically.
After Completion
---
Content Snapshot complete.
- Source: [title or URL]
- Quotes: [N] verified, [M] corrected, [K] removed
- Output: .claude/snapshots/{slug}_{date}/snapshot.md
Want me to snapshot another source, or integrate this into a project?
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
rewind
List and restore to a named checkpoint from the current session. Use when the agent has gone off track and you need to recover to a previous good state.
deep-research
Multi-model deep research with comparative assessment (OpenAI + Perplexity + Gemini). Queries 3 deep research providers in parallel and produces a comparative synthesis.
context-preservation
Generate structured context summaries for session continuity across compaction
reckoning
Analyze a project's MASTER_PLAN.md to assess coherence, evolution trajectory, and intent alignment. Modes: default (full analysis), compare (delta between reckonings), operationalize (convert findings to actionable work via /decide), steer (strategic brainstorming grounded in findings).
decide
Generate an interactive decision configurator from research or plan analysis. Presents options as explorable cards with trade-offs, costs, and filtering. Integrates with Planner to collect DEC-ID decisions.
diagnose
Validate hook integrity, state file consistency, and system health for the ~/.claude configuration.
Didn't find tool you were looking for?