Context Extractor Skill

You are an expert parser specializing in extracting structured context from Product Requirements Documents (PRDs). You excel at parsing markdown tables and converting them into machine-readable JSON.

When to Use This Skill

Extracting context from PRD files for implementation
Parsing "All Needed Context" sections
Converting PRD context into structured data
Preparing context bundles for /flow:generate-prp
Providing context to /flow:implement and /flow:validate

Input Format

This skill accepts a file path to a PRD markdown file as input. The PRD must contain an "All Needed Context" section with the following subsections:

Code Files - Source code files relevant to the feature
Docs / Specs - Related documentation and specifications
Examples - Example files demonstrating patterns
Gotchas / Prior Failures - Known pitfalls and lessons learned
External Systems / APIs - External dependencies and integrations

Parsing Instructions

1. Locate the "All Needed Context" Section

Search for the markdown heading ## All Needed Context in the PRD file. All content between this heading and the next H2 heading (##), or the end of the file if no further H2 follows, is part of the context section.
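One way to locate the section is a line scan, sketched here in Python. The function name `extract_context_section` is hypothetical, and the sketch assumes the heading appears at the start of a line and that no fenced code block inside the section contains a line starting with `## `.

```python
import re
from typing import Optional


def extract_context_section(prd_text: str, heading: str = "All Needed Context") -> Optional[str]:
    """Return the text under the H2 `heading`, or None if the heading is absent."""
    lines = prd_text.splitlines()
    start = None
    for i, line in enumerate(lines):
        if re.fullmatch(rf"##\s+{re.escape(heading)}\s*", line):
            start = i + 1  # content begins on the line after the heading
            break
    if start is None:
        return None

    end = len(lines)  # assumption: the section runs to end of file if no H2 follows
    for j in range(start, len(lines)):
        # An H2 is exactly two '#' followed by whitespace, so '###' does not match.
        if re.match(r"##\s", lines[j]):
            end = j
            break
    return "\n".join(lines[start:end]).strip()
```

Returning None rather than an empty string keeps "section missing" distinct from "section present but empty", which matters for the error cases described later.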

2. Parse Each Subsection

For each subsection (H3 heading ###), parse the markdown table that follows:

Code Files Table Format

markdown

| File Path | Purpose | Read Priority |
|-----------|---------|---------------|
| `path/to/file` | Description | High/Medium/Low |

Extract into:

json

{
  "path": "path/to/file",
  "purpose": "Description",
  "priority": "High|Medium|Low"
}

Docs / Specs Table Format

markdown

| Document | Link | Key Sections |
|----------|------|--------------|
| Doc Name | `docs/path` or URL | Sections |

Extract into:

json

{
  "title": "Doc Name",
  "link": "docs/path or URL",
  "key_sections": "Sections"
}

Examples Table Format

markdown

| Example | Location | Relevance to This Feature |
|---------|----------|---------------------------|
| Example Name | `examples/path` | Description |

Extract into:

json

{
  "name": "Example Name",
  "location": "examples/path",
  "relevance": "Description"
}

Gotchas / Prior Failures Table Format

markdown

| Gotcha | Impact | Mitigation | Source |
|--------|--------|------------|--------|
| Issue | What happens | How to fix | Reference |

Extract into:

json

{
  "issue": "Issue",
  "impact": "What happens",
  "mitigation": "How to fix",
  "source": "Reference"
}

External Systems / APIs Table Format

markdown

| System / API | Type | Documentation | Notes |
|--------------|------|---------------|-------|
| System Name | REST/GraphQL/etc | Link | Details |

Extract into:

json

{
  "name": "System Name",
  "type": "REST|GraphQL|gRPC|Database|etc",
  "documentation": "Link",
  "notes": "Details"
}
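
The five table formats above differ only in their column-to-field mapping, so one table parser can serve all of them. The sketch below is a minimal illustration, not a full markdown parser: `FIELDS`, `split_row`, `is_separator`, and `parse_table` are hypothetical names, cells containing escaped pipes (`\|`) are not handled, and the cleanup from step 4 is not applied here.

```python
from typing import Dict, List

# Column order per subsection, taken from the table formats above.
FIELDS: Dict[str, List[str]] = {
    "Code Files": ["path", "purpose", "priority"],
    "Docs / Specs": ["title", "link", "key_sections"],
    "Examples": ["name", "location", "relevance"],
    "Gotchas / Prior Failures": ["issue", "impact", "mitigation", "source"],
    "External Systems / APIs": ["name", "type", "documentation", "notes"],
}


def split_row(line: str) -> List[str]:
    """Split one '| a | b |' row into trimmed cell strings."""
    text = line.strip()
    if text.startswith("|"):
        text = text[1:]
    if text.endswith("|"):
        text = text[:-1]
    return [cell.strip() for cell in text.split("|")]


def is_separator(cells: List[str]) -> bool:
    """True for the '|---|---|' row that follows the header."""
    return all(cell and set(cell) <= set("-: ") for cell in cells)


def parse_table(subsection: str, table_lines: List[str]) -> List[Dict[str, str]]:
    """Map each data row of a subsection table onto its field names."""
    fields = FIELDS[subsection]
    rows = [split_row(line) for line in table_lines if line.strip().startswith("|")]
    # rows[0] is the header; drop it together with the separator row.
    data = [cells for cells in rows[1:] if not is_separator(cells)]
    entries = []
    for cells in data:
        if len(cells) != len(fields):
            raise ValueError(f"Malformed table in section '{subsection}'. Check markdown syntax.")
        entries.append(dict(zip(fields, cells)))
    return entries
```

Mapping by column position rather than by header text is a design choice in this sketch; it tolerates small header wording changes but assumes the column order shown above.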

3. Handle Empty Sections

If a subsection table has only headers (no data rows), or if the subsection is missing entirely, return an empty array [] for that section.
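
A simple way to guarantee this is to build the result from a fixed list of keys and default each one to an empty list. In this sketch, `OUTPUT_KEYS` and `build_output` are hypothetical names, and the `parsed` argument is assumed to map subsection headings to lists of already-extracted rows.

```python
import json
from typing import Dict, List

# Subsection heading -> output key, as listed under Output Format.
OUTPUT_KEYS = {
    "Code Files": "code_files",
    "Docs / Specs": "docs_specs",
    "Examples": "examples",
    "Gotchas / Prior Failures": "gotchas",
    "External Systems / APIs": "external_systems",
}


def build_output(parsed: Dict[str, List[dict]]) -> Dict[str, List[dict]]:
    """Emit all five keys; missing or header-only subsections become []."""
    return {key: list(parsed.get(heading) or []) for heading, key in OUTPUT_KEYS.items()}


# Only one subsection was found in this hypothetical PRD.
result = build_output(
    {"Code Files": [{"path": "src/app.py", "purpose": "Entry point", "priority": "High"}]}
)
print(json.dumps(result, indent=2))
```

Because every key is always present, consumers can iterate over each section without checking for null first.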

4. Clean Up Markdown Formatting

Remove backticks from file paths and code references
Trim whitespace from all fields
Convert inline code markers to plain text
Preserve line breaks in multi-line fields as \n in the JSON string
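
The cleanup rules above could be applied per cell along these lines. `clean_cell` is a hypothetical helper, and removing every backtick is an assumption that holds only when cell text never needs a literal backtick.

```python
import json


def clean_cell(raw: str) -> str:
    """Strip inline-code backticks and surrounding whitespace from one table cell."""
    text = raw.replace("`", "")
    # Trim each line separately so that internal line breaks survive.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(lines).strip()


print(clean_cell("  `src/flowspec_cli/commands/specify.py` "))
# json.dumps writes a line break inside a string as the two characters \n.
print(json.dumps(clean_cell("first line \n  second line")))
```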

Output Format

Return a JSON object with the following structure:

json

{
  "code_files": [
    {
      "path": "src/flowspec_cli/commands/specify.py",
      "purpose": "Main implementation of /flow:specify command",
      "priority": "High"
    }
  ],
  "docs_specs": [
    {
      "title": "Spec-Driven Development Guide",
      "link": "docs/guides/sdd-guide.md",
      "key_sections": "Section 3: Context Management"
    }
  ],
  "examples": [
    {
      "name": "User Authentication Flow",
      "location": "examples/auth/login.py",
      "relevance": "Shows proper session handling pattern"
    }
  ],
  "gotchas": [
    {
      "issue": "Race condition in concurrent writes",
      "impact": "Data corruption under high load",
      "mitigation": "Use database transactions with proper isolation",
      "source": "task-123"
    }
  ],
  "external_systems": [
    {
      "name": "GitHub API",
      "type": "REST",
      "documentation": "https://docs.github.com/rest",
      "notes": "Rate limit: 5000 req/hour, requires PAT"
    }
  ]
}

Error Handling

If the PRD file cannot be read or parsed:

Return an error object: {"error": "Description of error"}
Include the file path in the error message
Suggest remediation steps if applicable

Common Error Cases

File not found: {"error": "PRD file not found: {path}. Verify the file exists."}
No context section: {"error": "PRD missing 'All Needed Context' section. Add section to PRD."}
Malformed table: {"error": "Malformed table in section '{section_name}'. Check markdown syntax."}
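
The first two error cases can be produced before any table parsing starts. The sketch below is one possible arrangement: `error`, `check_text`, and `check_prd` are hypothetical names, and the "Cannot read PRD file" message is an assumption added for unreadable files, which the list above does not cover.

```python
from pathlib import Path
from typing import Dict, Optional


def error(message: str) -> Dict[str, str]:
    """Wrap a message in the error object shape used by this skill."""
    return {"error": message}


def check_text(text: str) -> Optional[Dict[str, str]]:
    """Return an error object if the context heading is absent, else None."""
    if not any(line.strip() == "## All Needed Context" for line in text.splitlines()):
        return error("PRD missing 'All Needed Context' section. Add section to PRD.")
    return None


def check_prd(path: str) -> Optional[Dict[str, str]]:
    """Return an error object for an unusable PRD file, else None."""
    prd = Path(path)
    if not prd.is_file():
        return error(f"PRD file not found: {path}. Verify the file exists.")
    try:
        text = prd.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        # Assumed message; not one of the documented error cases.
        return error(f"Cannot read PRD file {path}: {exc}")
    return check_text(text)
```

Returning None on success lets the caller continue straight to parsing, while any non-None value can be returned to the user unchanged.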

Usage Example

Input PRD Excerpt

markdown

## All Needed Context

### Code Files

| File Path | Purpose | Read Priority |
|-----------|---------|---------------|
| `src/flowspec_cli/commands/specify.py` | Main /flow:specify implementation | High |
| `templates/prd-template.md` | PRD template structure | Medium |

### Docs / Specs

| Document | Link | Key Sections |
|----------|------|--------------|
| SDD Guide | `docs/guides/sdd-guide.md` | Context Management |

### Examples

| Example | Location | Relevance to This Feature |
|---------|----------|---------------------------|
| Login Flow | `examples/auth/login.py` | Session handling pattern |

### Gotchas / Prior Failures

| Gotcha | Impact | Mitigation | Source |
|--------|--------|------------|--------|
| Race condition | Data corruption | Use transactions | task-123 |

### External Systems / APIs

| System / API | Type | Documentation | Notes |
|--------------|------|---------------|-------|
| GitHub API | REST | https://docs.github.com/rest | 5000 req/hour limit |

Output JSON

json

{
  "code_files": [
    {
      "path": "src/flowspec_cli/commands/specify.py",
      "purpose": "Main /flow:specify implementation",
      "priority": "High"
    },
    {
      "path": "templates/prd-template.md",
      "purpose": "PRD template structure",
      "priority": "Medium"
    }
  ],
  "docs_specs": [
    {
      "title": "SDD Guide",
      "link": "docs/guides/sdd-guide.md",
      "key_sections": "Context Management"
    }
  ],
  "examples": [
    {
      "name": "Login Flow",
      "location": "examples/auth/login.py",
      "relevance": "Session handling pattern"
    }
  ],
  "gotchas": [
    {
      "issue": "Race condition",
      "impact": "Data corruption",
      "mitigation": "Use transactions",
      "source": "task-123"
    }
  ],
  "external_systems": [
    {
      "name": "GitHub API",
      "type": "REST",
      "documentation": "https://docs.github.com/rest",
      "notes": "5000 req/hour limit"
    }
  ]
}

Integration Points

/flow:implement

Uses extracted context to:

Identify files to read before implementation
Prioritize reading order (High → Medium → Low)
Discover related documentation
Warn about gotchas early
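
As an illustration of the reading-order step, the extracted code_files entries can be sorted by a numeric rank. `PRIORITY_RANK` and `reading_order` are hypothetical names; placing unrecognized priorities last is an assumption of this sketch.

```python
from typing import Dict, List

PRIORITY_RANK = {"High": 0, "Medium": 1, "Low": 2}


def reading_order(code_files: List[Dict[str, str]]) -> List[str]:
    """Return file paths sorted High, then Medium, then Low."""
    ranked = sorted(
        code_files,
        # Unknown priorities sort after Low (assumption).
        key=lambda entry: PRIORITY_RANK.get(entry["priority"], len(PRIORITY_RANK)),
    )
    return [entry["path"] for entry in ranked]
```

Python's sort is stable, so files with the same priority keep the order in which they appear in the PRD table.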

/flow:generate-prp

Uses extracted context to:

Build comprehensive context bundles
Include all relevant files and docs
Attach examples for reference
Warn about known failure modes

/flow:validate

Uses extracted context to:

Verify all referenced files exist
Check that documentation is up to date
Validate against known gotchas
Test external system integrations
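
The first of these checks might look like the following sketch. `missing_files` is a hypothetical helper, and resolving paths against a `root` directory (defaulting to the current directory) is an assumption about where the PRD's relative paths are anchored.

```python
from pathlib import Path
from typing import Dict, List


def missing_files(code_files: List[Dict[str, str]], root: str = ".") -> List[str]:
    """Return the paths from code_files that do not exist as files under root."""
    return [
        entry["path"]
        for entry in code_files
        if not (Path(root) / entry["path"]).is_file()
    ]
```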

Validation Checklist

After parsing, verify:

All five sections present in output (even if empty)
File paths are clean (no backticks or extra quotes)
Priorities are valid (High/Medium/Low only)
JSON is valid and properly formatted
No markdown artifacts in extracted text
Empty sections return [], not null
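
Most of this checklist can be automated. The sketch below covers the structural checks only; `SECTIONS`, `PRIORITIES`, and `validate_output` are hypothetical names, and treating any backtick as a markdown artifact is a simplifying assumption.

```python
import json
from typing import Any, Dict, List

SECTIONS = ("code_files", "docs_specs", "examples", "gotchas", "external_systems")
PRIORITIES = {"High", "Medium", "Low"}


def validate_output(result: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems: List[str] = []
    for key in SECTIONS:
        value = result.get(key)
        if not isinstance(value, list):
            # Catches both a missing key and null instead of [].
            problems.append(f"{key}: expected a list, found {type(value).__name__}")
            continue
        for index, entry in enumerate(value):
            if not isinstance(entry, dict):
                problems.append(f"{key}[{index}]: expected an object")
                continue
            for field, text in entry.items():
                if isinstance(text, str) and "`" in text:
                    problems.append(f"{key}[{index}].{field}: contains a backtick")
            if key == "code_files":
                path = entry.get("path", "")
                if path != path.strip("\"' "):
                    problems.append(f"{key}[{index}].path: has surrounding quotes or spaces")
                if entry.get("priority") not in PRIORITIES:
                    problems.append(f"{key}[{index}].priority: not High/Medium/Low")
    try:
        json.dumps(result)
    except (TypeError, ValueError) as exc:
        problems.append(f"not JSON-serializable: {exc}")
    return problems
```

Collecting all problems instead of stopping at the first one gives the PRD author a complete list to fix in a single pass.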

Quality Standards

Accuracy: Preserve exact meanings from PRD
Completeness: Extract all rows from all tables
Cleanliness: Remove markdown formatting artifacts
Consistency: Use consistent field names and structure
Robustness: Handle missing sections gracefully
