Agent skill
codebase-analyzer
Statistical rule discovery from Go codebase patterns.
Install this agent skill to your Project
npx add-skill https://github.com/notque/claude-code-toolkit/tree/main/skills/codebase-analyzer
SKILL.md
Codebase Analyzer Skill
Statistical rule discovery through measurement of Go codebases. Python scripts count patterns to avoid LLM training bias, then statistics are interpreted to derive confidence-scored rules. The core principle is Measure First, Interpret Second -- what IS in the code is the local standard, not what an LLM thinks "should be" there.
Instructions
Phase 1: CONFIGURE
Goal: Validate target and select analyzer variant.
Read and follow the repository's CLAUDE.md before doing anything else -- project instructions override default behaviors.
Step 1: Validate the target
- Confirm path points to a Go repository root with .go files
- Check for standard structure (cmd/, internal/, pkg/)
- Verify sufficient file count: 50+ files for meaningful rules, 100+ ideal. Below 50 files, statistics produce high variance -- patterns that look consistent may be coincidence. For small repos, combine analysis across multiple team repos rather than treating thin data as definitive.
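The validation checks above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the cartographer's actual logic: the function names are hypothetical, and skipping vendor/, testdata/, and .git/ during the count is an assumption made so the number reflects the team's own code.

```python
import os

# Assumption: these directories hold external or fixture code and are not counted.
SKIP_DIRS = {"vendor", "testdata", ".git"}


def count_go_files(root):
    """Count .go files under root, skipping the directories in SKIP_DIRS."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        total += sum(1 for name in filenames if name.endswith(".go"))
    return total


def detect_structure(root):
    """Return which of cmd/, internal/, pkg/ exist at the root, or ["flat"]."""
    found = [d for d in ("cmd", "internal", "pkg")
             if os.path.isdir(os.path.join(root, d))]
    return found or ["flat"]


def sample_size_verdict(go_file_count):
    """Apply the thresholds stated above: 50+ meaningful, 100+ ideal."""
    if go_file_count >= 100:
        return "ideal"
    if go_file_count >= 50:
        return "meaningful"
    return "too small: combine with other team repos"
```

Calling `sample_size_verdict(count_go_files(path))` gives a quick answer to whether the gate at the end of this phase can pass.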
Step 2: Select cartographer variant
| Variant | Script | Metrics | Use When |
|---|---|---|---|
| Omni (recommended) | cartographer_omni.py (not yet implemented) | 100 across 25 categories | Full codebase profiling |
| Basic | cartographer.py (not yet implemented) | ~15 categories | Quick pattern overview |
| Ultimate | cartographer_ultimate.py | 6 focused categories | Performance pattern detection |
Step 3: Verify environment
- Python 3.7+ available
- No external dependencies needed (uses only Python standard library)
- Output directories exist or can be created
===============================================================
PHASE 1: CONFIGURE
===============================================================
Target Repository:
- Path: [/path/to/repo]
- Go Files: [N files found]
- Structure: [cmd/ | internal/ | pkg/ | flat]
Variant Selected: [Omni | Basic | Ultimate]
Reason: [why this variant]
Validation:
- [ ] Path exists and contains .go files
- [ ] File count >= 50 (actual: N)
- [ ] Python 3.7+ available
- [ ] Output directory writable
CONFIGURE complete. Proceeding to MEASURE...
===============================================================
Gate: Target directory exists, contains 50+ Go files, variant selected. Proceed only when gate passes.
Phase 2: MEASURE
Goal: Run statistical analysis scripts. Pure measurement -- no interpretation yet.
This phase is strictly mechanical. Scripts count and measure; keep interpretation separate from data collection. Combining measurement with interpretation introduces LLM training bias -- the model reports what "should be" instead of what IS. Run scripts first, interpret the numbers second, always as separate steps.
Automatically filter vendor/, testdata/, and generated code (files with "Code generated by..." markers) to avoid polluting statistics with external patterns.
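One way the filter could work is sketched below. The helper names are hypothetical. The marker check assumes generated files carry a comment line beginning with `// Code generated ` before the package clause, which is looser than the Go toolchain convention (that convention also expects the line to end in `DO NOT EDIT.`).

```python
# Assumption: any path component named vendor or testdata marks excluded code.
EXCLUDED_DIRS = ("vendor", "testdata")


def is_generated(source):
    """Return True if a generated-code marker appears before the package clause."""
    for line in source.splitlines():
        if line.startswith("// Code generated "):
            return True
        if line.startswith("package "):
            break  # Markers after the package clause are not treated as headers.
    return False


def should_skip(rel_path, source):
    """Decide whether a file is excluded from measurement."""
    directories = rel_path.replace("\\", "/").split("/")[:-1]
    in_excluded_dir = any(part in EXCLUDED_DIRS for part in directories)
    return in_excluded_dir or is_generated(source)
```

Checking the marker only above the package clause keeps a hand-written file from being excluded just because it mentions the phrase somewhere in its body.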
Step 1: Execute the cartographer
# TODO: scripts/cartographer_omni.py not yet implemented
# Manual alternative: use grep/find to count patterns across Go files,
# skipping vendor/ and testdata/ so external code does not skew the counts
# Example: count error wrapping patterns
grep -rn 'fmt\.Errorf.*%w' ~/repos/my-project --include="*.go" --exclude-dir=vendor --exclude-dir=testdata | wc -l
# Example: count constructor patterns
grep -rn 'func New' ~/repos/my-project --include="*.go" --exclude-dir=vendor --exclude-dir=testdata | wc -l
Always take measurements from the cartographer scripts, or from manual counts while the scripts are unimplemented; reserve LLM interpretation for Phase 3. When an LLM sees return err, it may report "not wrapping errors properly" even if that IS the local standard. Scripted counts are deterministic and reproducible.
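The same counting can be done in Python, which keeps measurement mechanical and repeatable. The sketch below is not the cartographer implementation: the pattern names and regular expressions are illustrative assumptions, and like `grep | wc -l` it counts matching lines rather than parsing Go syntax.

```python
import re

# Hypothetical pattern set; each regex is applied line by line.
PATTERNS = {
    "error_wrap": re.compile(r"fmt\.Errorf\(.*%w"),
    "bare_return_err": re.compile(r"^\s*return err\s*$"),
    "constructor": re.compile(r"^func New[A-Z]\w*\("),
}


def count_patterns(sources):
    """Count matching lines per pattern across an iterable of Go source strings."""
    counts = {name: 0 for name in PATTERNS}
    for source in sources:
        for line in source.splitlines():
            for name, pattern in PATTERNS.items():
                if pattern.search(line):
                    counts[name] += 1
    return counts
```

The function only returns counts. Deciding whether a ratio such as wrapped to bare returns amounts to a rule is left to Phase 3.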
Step 2: Verify output integrity
- Confirm JSON output is valid and complete
- Check file count matches expectations (no vendor pollution)
- Verify all three lenses produced data
- Confirm derived_rules section exists in output
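A basic integrity check might look like the following. Only the derived_rules section name comes from this document; the rest of the report schema is not specified here, so the sketch checks nothing beyond JSON validity and the presence of required top-level keys. The function name is hypothetical.

```python
import json


def check_report(text, required_keys=("derived_rules",)):
    """Return a list of problems found in the report text (empty list means OK)."""
    try:
        report = json.loads(text)
    except json.JSONDecodeError as exc:
        return ["invalid JSON: {}".format(exc.msg)]
    if not isinstance(report, dict):
        return ["top-level JSON value is not an object"]
    return ["missing section: {}".format(key)
            for key in required_keys if key not in report]
```

Extra keys for the three lenses can be passed through `required_keys` once their names in the real output are known.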
Step 3: Check for data quality issues
- File count suspiciously high? Vendor code may be included
- File count suspiciously low? Subdirectories may be missed
- All percentages near 50%? May indicate mixed codebase or insufficient data
===============================================================
PHASE 2: MEASURE
===============================================================
Script Executed: [cartographer_omni.py (not yet implemented -- use manual pattern counting)]
Target: [/path/to/repo]
Results:
- Files analyzed: [N]
- Total lines: [N]
- Categories measured: [N of 25]
- Derived rules: [N auto-extracted]
Data Quality:
- [ ] JSON output valid
- [ ] File count reasonable (no vendor pollution)
- [ ] All three lenses have data
- [ ] No unexpected zeros in major categories
Output saved to: [path/to/output.json]
MEASURE complete. Proceeding to INTERPRET...
===============================================================
Gate: Script completed without errors, JSON output is valid, file count is reasonable. Proceed only when gate passes.
Phase 3: INTERPRET
Goal: Derive rules from statistics. This is where LLM interpretation happens -- AFTER measurement is complete.
Show the complete statistics rather than summarizing them, and report them without editorializing about code quality -- the numbers speak for themselves.
Step 1: Review the three lenses
| Lens | Question | Measures |
|---|---|---|
| Consistency (Frequency) | "How often do they use X?" | Imports, test frameworks, logging, modern features |
| Signature (Structure) | "How do they name/structure things?" | Constructors, receivers, parameter order, variables |
| Idiom (Implementation) | "How do they implement patterns?" | Error handling, control flow, context usage, defer |
For detailed lens explanations, see references/three-lenses.md.
Step 2: Extract rules by confidence
Only derive rules from patterns with sufficient consistency. Forcing rules from weak patterns causes false positives in reviews and may impose standards the team has not organically adopted.
| Confidence | Threshold | Action | Example |
|---|---|---|---|
| HIGH | >85% consistency | Extract as enforceable rule | "96% use err not e" -> MUST use err |
| MEDIUM | 70-85% consistency | Extract as recommendation | "78% guard clauses" -> SHOULD prefer guards |
| Below 70% | Not extracted as rule | Report as observation only | "55% single-letter receivers" -> No rule |
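The thresholds in the table translate directly into code. In this sketch the function names are hypothetical, and treating exactly 85% as MEDIUM follows from reading ">85%" strictly.

```python
def consistency_pct(variant_counts):
    """Return (dominant_variant, percent) for a mapping of variant -> occurrences."""
    total = sum(variant_counts.values())
    if total == 0:
        return None, 0.0
    dominant = max(variant_counts, key=variant_counts.get)
    return dominant, 100.0 * variant_counts[dominant] / total


def classify_confidence(pct):
    """Map a consistency percentage onto the confidence levels in the table."""
    if pct > 85:
        return "HIGH"
    if pct >= 70:
        return "MEDIUM"
    return "OBSERVATION"  # Below 70%: reported, but not extracted as a rule.
```

For instance, counts of 96 uses of err against 4 uses of e give a dominant variant of err at 96%, which classifies as HIGH.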
Step 3: Review Style Vector (Omni only)
- 10 composite scores (0-100): Consistency, Modernization, Safety, Idiomaticity, Documentation, Testing Maturity, Architecture, Performance, Observability, Production Readiness
- Identify strengths (scores >75) and gaps (scores <50)
- Note shadow constitution entries (accepted linter suppressions)
Step 4: Cross-reference lenses
- Pattern confirmed across multiple lenses = higher confidence
- Pattern in one lens only = standard confidence
- Contradictions between lenses = investigate further
Gate: Rules extracted with evidence and confidence levels. Style Vector reviewed. Proceed only when gate passes.
Phase 4: DELIVER
Goal: Produce actionable output artifacts.
Step 1: Save statistical report
cartography_data/{repo_name}_cartography.json
Step 2: Generate derived rules document
derived_rules/{repo_name}_rules.md
Format each rule as:
## Rule: [Statement]
**Confidence**: HIGH/MEDIUM
**Evidence**: [X% consistency across N occurrences]
**Category**: [error_handling | naming | control_flow | architecture | ...]
**Lens**: [Consistency | Signature | Idiom | Multiple]
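Rendering a rule in the format above can be automated. The helper below is a hypothetical sketch that fills the template fields from plain arguments. It does not validate that the confidence, category, or lens values are among those listed.

```python
def format_rule(statement, confidence, consistency_pct, occurrences, category, lens):
    """Render one derived rule using the template fields shown above."""
    return "\n".join([
        "## Rule: {}".format(statement),
        "**Confidence**: {}".format(confidence),
        "**Evidence**: {:.0f}% consistency across {} occurrences".format(
            consistency_pct, occurrences),
        "**Category**: {}".format(category),
        "**Lens**: {}".format(lens),
    ])
```

Joining the rendered rules with blank lines produces the body of the rules document.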
Step 3: Summarize Style Vector (Omni only)
## Style Vector Summary
| Dimension | Score | Assessment |
|-----------|-------|------------|
| Consistency | [0-100] | [Strength/Gap/Neutral] |
| Modernization | [0-100] | [Strength/Gap/Neutral] |
| ... | ... | ... |
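The summary table can be generated from the scores using the Phase 3 thresholds (above 75 is a strength, below 50 is a gap). The function names here are hypothetical, and labeling everything in between as Neutral is an assumption based on the Assessment column.

```python
def assess(score):
    """Classify a 0-100 Style Vector score as Strength, Gap, or Neutral."""
    if score > 75:
        return "Strength"
    if score < 50:
        return "Gap"
    return "Neutral"


def style_vector_table(scores):
    """Render a dimension -> score mapping as the summary table shown above."""
    lines = [
        "| Dimension | Score | Assessment |",
        "|-----------|-------|------------|",
    ]
    for dimension, score in scores.items():
        lines.append("| {} | {} | {} |".format(dimension, score, assess(score)))
    return "\n".join(lines)
```

Rows appear in the order the mapping was built, so passing the dimensions in the order listed in Phase 3 keeps reports comparable across runs.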
Step 4: Recommend next steps
- Compare with pr-workflow (miner) data if available (explicit vs implicit rules)
- Suggest CLAUDE.md updates for high-confidence rules
- Identify golangci-lint rules that could enforce discovered patterns
- Suggest quarterly re-analysis schedule -- coding patterns evolve with team growth and new Go versions, so a one-time snapshot becomes stale within months
===============================================================
PHASE 4: DELIVER
===============================================================
Artifacts:
- [ ] JSON report: [path]
- [ ] Rules document: [path]
- [ ] Style Vector summary: [included in rules doc]
Results Summary:
- HIGH confidence rules: [N]
- MEDIUM confidence rules: [N]
- Observations (below threshold): [N]
- Style Vector overall: [strong/mixed/weak]
Next Steps:
1. [Specific recommendation]
2. [Specific recommendation]
3. [Specific recommendation]
DELIVER complete. Analysis finished.
===============================================================
Gate: JSON report saved, rules document generated, next steps documented. Analysis complete.
Complementary Skills
| Skill | Extracts | Combined Value |
|---|---|---|
| pr-workflow (miner) | Explicit rules (what people argue about in reviews) | Agreement = HIGH confidence; Silence + consistency = implicit rule |
| codebase-analyzer | Implicit rules (what they actually do) | pr-workflow (miner) says X but code does Y = rule not followed |
Reconciliation Matrix
| pr-workflow (miner) | codebase-analyzer | Conclusion |
|---|---|---|
| Says X | Shows X at >85% | Confirmed rule (both explicit and practiced) |
| Silent | Shows X at >85% | Implicit rule (nobody argues because everyone agrees) |
| Says X | Shows Y at >85% | Rule stated but not followed (needs enforcement or is outdated) |
| Mixed signals | Inconsistent | No standard yet (opportunity to establish one) |
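The matrix can be read as a small decision function. In this sketch the function and argument names are hypothetical. It also assumes that any measurement at or below 85% counts as "inconsistent", which the matrix does not state explicitly.

```python
def reconcile(stated, observed, observed_pct):
    """Combine an explicit rule with a measured pattern.

    stated: the pattern pr-workflow (miner) asks for, or None if it is silent.
    observed: the dominant pattern measured in the code.
    observed_pct: consistency of the observed pattern, 0-100.
    """
    if observed_pct <= 85:
        return "No standard yet"
    if stated is None:
        return "Implicit rule"
    if stated == observed:
        return "Confirmed rule"
    return "Rule stated but not followed"
```

The consistency check comes first because without a dominant pattern in the code there is nothing for the stated rule to agree or disagree with.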
Examples
Example 1: Single Repository Analysis
User says: "What conventions does this repo follow?"
Actions:
- Validate target has 100+ Go files (CONFIGURE)
- Run pattern counting against the repo (MEASURE)
- Extract rules from statistics: error wrapping 89%, guard clauses 5.2x, New{Type} 94% (INTERPRET)
- Save JSON report and rules document (DELIVER)
Result: 30+ rules extracted with confidence levels, Style Vector produced
Example 2: Team-Wide Standards Discovery
User says: "Find our team's coding patterns across all services"
Actions:
- Validate all target repos, confirm 50+ files each (CONFIGURE)
- Run cartographer on each repo separately (MEASURE)
- Cross-reference patterns: error wrapping 87-91% across all repos = team standard (INTERPRET)
- Produce team-wide rules document with per-repo breakdowns (DELIVER)
Result: Team-wide standards with cross-repo evidence
Example 3: Onboarding New Developer
User says: "I just joined the team, what coding patterns should I follow?"
Actions:
- Identify main team repos, validate Go file counts (CONFIGURE)
- Run omni-cartographer on primary service (MEASURE)
- Extract top 10 HIGH confidence rules as onboarding checklist (INTERPRET)
- Produce concise rules doc focusing on error handling, naming, and control flow (DELIVER)
Result: Evidence-based onboarding guide with concrete examples from actual codebase
Error Handling
Error: "No Go files found"
Cause: Path does not point to a Go repository root, or .go files are in subdirectories not being scanned
Solution:
- Verify path points to repository root with `ls *.go` or `find . -name "*.go" | head`
- If Go files are nested, point to parent directory
- Confirm vendor/ is not the only directory containing Go files
Error: "No rules derived"
Cause: Codebase too small (<50 files) or patterns genuinely inconsistent
Solution:
- Check file count -- if <50, combine analysis across multiple repos from same team
- If 50+ files but no rules, team genuinely lacks consistent patterns
- Lower threshold to 60% to find emerging patterns (note reduced confidence)
Error: "Statistics dominated by vendor/generated code"
Cause: Vendor directory or generated files not filtered, polluting pattern data
Solution:
- Verify scripts are filtering vendor/, testdata/, and _test files for core patterns
- If non-standard structure, analyze specific directories manually
- Check for generated code markers (Code generated by...) and exclude those files
References
Reference Files
- ${CLAUDE_SKILL_DIR}/references/three-lenses.md: Detailed explanation of the three analysis lenses
- ${CLAUDE_SKILL_DIR}/references/examples.md: Real-world analysis examples and workflows
- ${CLAUDE_SKILL_DIR}/references/metrics-catalog.md: Complete 100-metric catalog across 25 categories
Prerequisites
- Python 3.7+
- Go codebase to analyze (50+ files recommended)
- No external dependencies (uses only Python standard library)