Agent skill
blog-factcheck
Verify statistics and claims in blog posts by fetching cited source URLs and checking if the claimed data actually appears on the page. Extracts all statistical claims (numbers, percentages, named sources), fetches each cited URL via WebFetch, and scores match confidence (exact match 1.0, paraphrase 0.7-0.9, not found 0.0). Flags uncited claims as UNVERIFIED. Use when user says "fact check", "verify statistics", "check sources", "validate claims", "factcheck", "source verification".
Install this agent skill to your Project
npx add-skill https://github.com/AgriciDaniel/claude-blog/tree/main/skills/blog-factcheck
SKILL.md
Blog Fact-Check
Verify statistics, claims, and source attributions in blog posts. Pure Claude pipeline with no external NLP dependencies.
Workflow
Step 1: Read the Blog Post
Read the target file and identify all sections containing data claims.
Step 2: Extract Statistical Claims
Scan the full text for every claim that includes a number, percentage, dollar amount, or named source. Build a claims list with these fields:
| Field | Description |
|---|---|
| claim_text | The exact sentence or phrase containing the statistic |
| value | The numeric value (e.g., "42%", "$1.2M", "3x") |
| attribution | Named source if present (e.g., "HubSpot", "Gartner 2025") |
| url | Cited URL if present (from markdown link or parenthetical) |
| location | Heading or line number where the claim appears |
Step 3: Verify Cited Claims
For each claim that includes a URL:
- Fetch the source page via WebFetch
- Search the returned content for the specific numeric value
- If exact value found, check surrounding context matches the claim topic
- Assign a confidence score (see Verification Scoring below)
Process claims sequentially to avoid rate-limiting source sites.
Step 4: Flag Uncited Claims
For claims without a URL:
- Mark status as UNVERIFIED
- Suggest a search query the user can run to find a source
- If the attribution names a specific organization, suggest their domain
Step 5: Generate Verification Report
Output the full results table, summary statistics, and recommended actions.
Claim Extraction Patterns
Identify claims matching these structures:
Fully cited (highest priority):
[Number]% [claim] ([Source], [Year])- parenthetical citation[claim] [Number]% ... [markdown link to source]- inline linkAccording to [Source], [Number]...- attribution lead
Uncited statistics (flag for sourcing):
[Number]% of [noun phrase]- standalone percentage[Number]x more/less/higher/lower- multiplier claims$[Number] [claim]- dollar figures without attribution
Weak signals (check context before extracting):
studies show,research indicates,data suggests+ nearby numbersurvey found,report reveals,analysis shows+ nearby number- Round numbers in isolation (e.g., "millions of users") - skip unless specific
Verification Scoring
| Score | Status | Criteria |
|---|---|---|
| 1.0 | VERIFIED | Exact number found on cited page in matching context |
| 0.7-0.9 | PARAPHRASE | Similar data found but with different wording, rounding, or timeframe |
| 0.3-0.6 | WEAK | Source page exists and covers the topic but the specific statistic is not visible |
| 0.0 | NOT FOUND | Cited page does not contain the claimed data anywhere |
| N/A | UNVERIFIED | No source URL provided for the claim |
Scoring guidance:
- A claim of "43%" when the source says "nearly half" scores 0.8
- A claim of "2024" data when the source only has "2023" scores 0.7
- A claim citing a homepage when the stat lives on a subpage scores 0.3
- A 404 or unreachable URL scores 0.0
Output Format
Verification Report: [Post Title]
File: [path] Claims found: [total] Verified: [count] | Paraphrase: [count] | Weak: [count] | Not Found: [count] | Unverified: [count]
| # | Claim | Source URL | Score | Status | Notes |
|---|---|---|---|---|---|
| 1 | "73% of marketers..." | https://example.com/report | 1.0 | VERIFIED | Exact match found in section 3 |
| 2 | "5x ROI improvement" | https://example.com/study | 0.8 | PARAPHRASE | Source says "nearly 5x" |
| 3 | "60% prefer video" | (none) | N/A | UNVERIFIED | Try: "video preference statistics 2025" |
Recommended Actions
- [List claims that need source URLs]
- [List claims with weak or not-found scores that need replacement sources]
- [List claims where the source data may be outdated]
Integration
This skill can be called from blog-analyze as an optional deep-verification step.
When invoked from the analyzer, only claims scoring below 0.7 are flagged in the
analysis report.
Standalone usage: /blog factcheck path/to/post.md
Limitations
- Paywalled content: WebFetch cannot access content behind login walls. These score as WEAK (0.5) with a note about paywall detection.
- Dynamic pages: JavaScript-rendered content may not be available via WebFetch. If the page returns minimal content, note this in the status.
- PDF sources: WebFetch may not extract PDF text reliably. Flag PDF URLs for manual verification.
- Archived pages: If a URL returns 404, suggest checking web.archive.org.
- Rate limits: Process no more than 10 URLs per run to avoid overwhelming source servers. If a post has more than 10 cited URLs, verify the first 10 and list the remainder as SKIPPED.
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
blog-brief
Generate detailed content briefs for blog posts with target keywords, content outlines, competitive analysis, recommended statistics, image and chart suggestions, word count targets, internal linking architecture, template recommendations (12 types), TL;DR drafts, citation capsule planning, information gain prompts, and multi-channel distribution plans. Briefs are optimized for Google rankings and AI citations (GEO/AEO). Use when user says "content brief", "blog brief", "write brief", "outline blog", "plan blog post", "blog outline", "content outline".
blog
Full-lifecycle blog engine with 21 commands, 12 content templates, 5-category 100-point scoring, and 4 specialized agents. Optimized for Google rankings (December 2025 Core Update, E-E-A-T) and AI citations (GEO/AEO). Writes, rewrites, analyzes, outlines, audits, and repurposes blog content with answer-first formatting, sourced statistics, Pixabay/Unsplash/Pexels images, AI image generation via Gemini, built-in SVG chart generation, JSON-LD schema generation, and freshness signals. Supports any platform (WordPress, Next.js MDX, Hugo, Ghost, Astro, Jekyll, 11ty, Gatsby, HTML). Use when user says "blog", "write blog", "blog post", "blog strategy", "content brief", "editorial calendar", "analyze blog", "rewrite blog", "update blog", "blog SEO", "blog optimization", "content plan", "blog outline", "seo check", "schema markup", "repurpose", "geo audit", "blog audit", "citation readiness".
blog-persona
Create and manage writing personas with NNGroup 4-dimension tone framework (Funny-Serious, Formal-Casual, Respectful-Irreverent, Enthusiastic-Matter-of-fact). Personas define readability targets, sentence length distribution, vocabulary tier, contraction frequency, and summary box label. Used by blog-write and blog-rewrite to enforce consistent voice. Use when user says "persona", "voice", "tone", "writing style", "brand voice", "create persona", "use persona".
blog-analyze
Audit and score blog posts on a 5-category 100-point scoring system covering content quality, SEO optimization, E-E-A-T signals, technical elements, and AI citation readiness. Includes AI content detection (burstiness, phrase flagging, vocabulary diversity). Supports export formats (markdown, JSON, table) and batch analysis with sorting. Generates prioritized recommendations (Critical/High/Medium/Low) with specific fixes. Works with any format (MDX, markdown, HTML, URL). Use when user says "analyze blog", "audit blog", "blog score", "check blog quality", "blog review", "rate this blog", "blog health check".
blog-taxonomy
Extract, suggest, and sync tags and categories for blog posts across all major CMS platforms. Supports WordPress REST API, Shopify GraphQL, Ghost Content API, Strapi REST/GraphQL, and Sanity GROQ. Generates tag suggestions from content analysis (keyword frequency, heading extraction, semantic grouping), enforces minimum post-count thresholds to prevent thin tag archives, and syncs taxonomy via authenticated API calls. Use when user says "tags", "categories", "taxonomy", "tag suggestions", "sync tags", "WordPress tags", "Shopify tags".
blog-audit
Full-site blog health assessment scanning all blog files for quality scores, orphan pages, topic cannibalization, stale content, and AI citation readiness. Spawns parallel subagents for comprehensive analysis. Produces per-post scores and a prioritized action queue. Use when user says "audit blog", "blog audit", "site audit", "blog health", "audit all posts", "check all blogs".
Didn't find tool you were looking for?