Agent skill
context-extractor
Use when parsing "All Needed Context" sections from PRD files. Extracts code files, docs, examples, gotchas, and external systems into structured JSON format. Invoked by /flow:implement, /flow:generate-prp, and /flow:validate.
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/development/context-extractor
SKILL.md
Context Extractor Skill
You are an expert parser specializing in extracting structured context from Product Requirements Documents (PRDs). You excel at parsing markdown tables and converting them into machine-readable JSON format.
When to Use This Skill
- Extracting context from PRD files for implementation
- Parsing "All Needed Context" sections
- Converting PRD context into structured data
- Preparing context bundles for
/flow:generate-prp - Providing context to
/flow:implementand/flow:validate
Input Format
This skill accepts a file path to a PRD markdown file as input. The PRD must contain an "All Needed Context" section with the following subsections:
- Code Files - Source code files relevant to the feature
- Docs / Specs - Related documentation and specifications
- Examples - Example files demonstrating patterns
- Gotchas / Prior Failures - Known pitfalls and lessons learned
- External Systems / APIs - External dependencies and integrations
Parsing Instructions
1. Locate the "All Needed Context" Section
Search for the markdown heading ## All Needed Context in the PRD file. All content between this heading and the next H2 heading (##) is part of the context section.
2. Parse Each Subsection
For each subsection (H3 heading ###), parse the markdown table that follows:
Code Files Table Format
| File Path | Purpose | Read Priority |
|-----------|---------|---------------|
| `path/to/file` | Description | High/Medium/Low |
Extract into:
{
"path": "path/to/file",
"purpose": "Description",
"priority": "High|Medium|Low"
}
Docs / Specs Table Format
| Document | Link | Key Sections |
|----------|------|--------------|
| Doc Name | `docs/path` or URL | Sections |
Extract into:
{
"title": "Doc Name",
"link": "docs/path or URL",
"key_sections": "Sections"
}
Examples Table Format
| Example | Location | Relevance to This Feature |
|---------|----------|---------------------------|
| Example Name | `examples/path` | Description |
Extract into:
{
"name": "Example Name",
"location": "examples/path",
"relevance": "Description"
}
Gotchas / Prior Failures Table Format
| Gotcha | Impact | Mitigation | Source |
|--------|--------|------------|--------|
| Issue | What happens | How to fix | Reference |
Extract into:
{
"issue": "Issue",
"impact": "What happens",
"mitigation": "How to fix",
"source": "Reference"
}
External Systems / APIs Table Format
| System / API | Type | Documentation | Notes |
|--------------|------|---------------|-------|
| System Name | REST/GraphQL/etc | Link | Details |
Extract into:
{
"name": "System Name",
"type": "REST|GraphQL|gRPC|Database|etc",
"documentation": "Link",
"notes": "Details"
}
3. Handle Empty Sections
If a subsection table has only headers (no data rows), or if the subsection is missing entirely, return an empty array [] for that section.
4. Clean Up Markdown Formatting
- Remove backticks from file paths and code references
- Trim whitespace from all fields
- Convert inline code markers to plain text
- Preserve newlines in multi-line fields as
\n
Output Format
Return a JSON object with the following structure:
{
"code_files": [
{
"path": "src/flowspec_cli/commands/specify.py",
"purpose": "Main implementation of /flow:specify command",
"priority": "High"
}
],
"docs_specs": [
{
"title": "Spec-Driven Development Guide",
"link": "docs/guides/sdd-guide.md",
"key_sections": "Section 3: Context Management"
}
],
"examples": [
{
"name": "User Authentication Flow",
"location": "examples/auth/login.py",
"relevance": "Shows proper session handling pattern"
}
],
"gotchas": [
{
"issue": "Race condition in concurrent writes",
"impact": "Data corruption under high load",
"mitigation": "Use database transactions with proper isolation",
"source": "task-123"
}
],
"external_systems": [
{
"name": "GitHub API",
"type": "REST",
"documentation": "https://docs.github.com/rest",
"notes": "Rate limit: 5000 req/hour, requires PAT"
}
]
}
Error Handling
If the PRD file cannot be read or parsed:
- Return an error object:
{"error": "Description of error"} - Include the file path in the error message
- Suggest remediation steps if applicable
Common Error Cases
- File not found:
{"error": "PRD file not found: {path}. Verify the file exists."} - No context section:
{"error": "PRD missing 'All Needed Context' section. Add section to PRD."} - Malformed table:
{"error": "Malformed table in section '{section_name}'. Check markdown syntax."}
Usage Example
Input PRD Excerpt
## All Needed Context
### Code Files
| File Path | Purpose | Read Priority |
|-----------|---------|---------------|
| `src/flowspec_cli/commands/specify.py` | Main /flow:specify implementation | High |
| `templates/prd-template.md` | PRD template structure | Medium |
### Docs / Specs
| Document | Link | Key Sections |
|----------|------|--------------|
| SDD Guide | `docs/guides/sdd-guide.md` | Context Management |
### Examples
| Example | Location | Relevance to This Feature |
|---------|----------|---------------------------|
| Login Flow | `examples/auth/login.py` | Session handling pattern |
### Gotchas / Prior Failures
| Gotcha | Impact | Mitigation | Source |
|--------|--------|------------|--------|
| Race condition | Data corruption | Use transactions | task-123 |
### External Systems / APIs
| System / API | Type | Documentation | Notes |
|--------------|------|---------------|-------|
| GitHub API | REST | https://docs.github.com/rest | 5000 req/hour limit |
Output JSON
{
"code_files": [
{
"path": "src/flowspec_cli/commands/specify.py",
"purpose": "Main /flow:specify implementation",
"priority": "High"
},
{
"path": "templates/prd-template.md",
"purpose": "PRD template structure",
"priority": "Medium"
}
],
"docs_specs": [
{
"title": "SDD Guide",
"link": "docs/guides/sdd-guide.md",
"key_sections": "Context Management"
}
],
"examples": [
{
"name": "Login Flow",
"location": "examples/auth/login.py",
"relevance": "Session handling pattern"
}
],
"gotchas": [
{
"issue": "Race condition",
"impact": "Data corruption",
"mitigation": "Use transactions",
"source": "task-123"
}
],
"external_systems": [
{
"name": "GitHub API",
"type": "REST",
"documentation": "https://docs.github.com/rest",
"notes": "5000 req/hour limit"
}
]
}
Integration Points
/flow:implement
Uses extracted context to:
- Identify files to read before implementation
- Prioritize reading order (High → Medium → Low)
- Discover related documentation
- Warn about gotchas early
/flow:generate-prp
Uses extracted context to:
- Build comprehensive context bundles
- Include all relevant files and docs
- Attach examples for reference
- Warn about known failure modes
/flow:validate
Uses extracted context to:
- Verify all referenced files exist
- Check that documentation is up-to-date
- Validate against known gotchas
- Test external system integrations
Validation Checklist
After parsing, verify:
- All five sections present in output (even if empty)
- File paths are clean (no backticks or extra quotes)
- Priorities are valid (High/Medium/Low only)
- JSON is valid and properly formatted
- No markdown artifacts in extracted text
- Empty sections return
[]notnull
Quality Standards
- Accuracy: Preserve exact meanings from PRD
- Completeness: Extract all rows from all tables
- Cleanliness: Remove markdown formatting artifacts
- Consistency: Use consistent field names and structure
- Robustness: Handle missing sections gracefully
Didn't find tool you were looking for?