review-llm-artifacts
Detects common LLM coding agent artifacts by spawning 4 parallel subagents
Install this agent skill to your Project
npx add-skill https://github.com/existential-birds/beagle/tree/main/plugins/beagle-core/skills/review-llm-artifacts
LLM Artifacts Review
Detect common artifacts left behind by LLM coding agents: over-abstraction, dead code, DRY violations in tests, verbose comments, and defensive overkill.
Arguments
- --all: Scan entire codebase (default: changed files from main)
- --parallel: Force parallel execution (default when 4+ files)
- Path: Target directory (default: current working directory)
Step 1: Determine Scope
Parse $ARGUMENTS for flags and path:
# Default: changed files from main
git diff --name-only $(git merge-base HEAD main)..HEAD | grep -E '\.(py|ts|tsx|js|jsx|go|rs|java|rb|swift|kt)$'
# If --all flag: scan entire codebase
find . -type f \( -name "*.py" -o -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.rb" -o -name "*.swift" -o -name "*.kt" \) ! -path "*/node_modules/*" ! -path "*/.git/*" ! -path "*/vendor/*" ! -path "*/__pycache__/*"
If no files found, exit with: "No files to scan. Check your branch has changes or use --all to scan the entire codebase."
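The flag handling described under Arguments can be sketched as follows. This is a minimal illustration, not the skill's actual parser: `ScanOptions` and `parse_arguments` are hypothetical names, and ignoring unknown flags is an assumption.

```python
import shlex
from dataclasses import dataclass


@dataclass
class ScanOptions:
    scan_all: bool = False  # --all: scan the entire codebase
    parallel: bool = False  # --parallel: force parallel execution
    path: str = "."         # target directory, defaults to the current one


def parse_arguments(arguments: str) -> ScanOptions:
    """Split the raw argument string into the two flags and an optional path."""
    options = ScanOptions()
    for token in shlex.split(arguments):
        if token == "--all":
            options.scan_all = True
        elif token == "--parallel":
            options.parallel = True
        elif not token.startswith("--"):
            options.path = token  # the last non-flag token wins
        # Assumption: any other flag is silently ignored.
    return options
```

Using `shlex.split` rather than `str.split` keeps quoted paths that contain spaces in one piece.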
Step 2: Detect Languages
Extract unique file extensions from the file list:
# Get unique extensions
echo "$FILES" | sed 's/.*\.//' | sort -u
Map extensions to language names for the report:
- .py -> Python
- .ts, .tsx -> TypeScript
- .js, .jsx -> JavaScript
- .go -> Go
- .rs -> Rust
- .java -> Java
- .rb -> Ruby
- .swift -> Swift
- .kt -> Kotlin
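The extension mapping can be written as a small lookup table. The sketch below assumes the file list arrives as a list of path strings. `detect_languages` is a hypothetical helper, and sorting the result alphabetically is a choice made here, not something the skill specifies.

```python
from pathlib import PurePosixPath

# Extension-to-language mapping taken from the list above.
EXTENSION_TO_LANGUAGE = {
    ".py": "Python",
    ".ts": "TypeScript",
    ".tsx": "TypeScript",
    ".js": "JavaScript",
    ".jsx": "JavaScript",
    ".go": "Go",
    ".rs": "Rust",
    ".java": "Java",
    ".rb": "Ruby",
    ".swift": "Swift",
    ".kt": "Kotlin",
}


def detect_languages(files):
    """Return the unique language names for the given file paths."""
    languages = set()
    for name in files:
        extension = PurePosixPath(name).suffix
        if extension in EXTENSION_TO_LANGUAGE:
            languages.add(EXTENSION_TO_LANGUAGE[extension])
    return sorted(languages)
```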
Step 3: Spawn Parallel Subagents
If the file count is >= 4 OR the --parallel flag is set, spawn 4 subagents via the Task tool.
Each subagent MUST:
- Load the skill: Skill(skill: "beagle-core:llm-artifacts-detection")
- Review only its assigned category
- Return findings in the structured format below
Subagent 1: Tests Agent
Focus: Testing anti-patterns from LLM generation
- DRY violations (repeated setup code, duplicate assertions)
- Testing library/framework code instead of application logic
- Wrong mock boundaries (mocking too much or too little)
- Overly verbose test names that describe implementation
- Tests that just mirror the implementation
Subagent 2: Dead Code Agent
Focus: Unused or obsolete code
- Unused imports, variables, functions, classes
- TODO/FIXME comments that should have been resolved
- Backwards compatibility code for removed features
- Orphaned test files for deleted code
- Commented-out code blocks
- Feature flags that are always on/off
Subagent 3: Abstraction Agent
Focus: Over-engineering patterns
- Unnecessary abstraction layers (interfaces for single implementations)
- Copy-paste drift (similar code that diverged slightly)
- Over-configuration (configurable things that never change)
- Premature generalization
- Factory/Builder patterns for simple object creation
- Deep inheritance hierarchies
Subagent 4: Style Agent
Focus: Verbose or defensive patterns
- Verbose comments explaining obvious code
- Defensive overkill (null checks on non-nullable values)
- Unnecessary type hints (dynamic languages with obvious types)
- Overly explicit error messages
- Redundant logging
- Redundant documentation on self-documenting code
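The spawn rule and the four categories can be sketched as a planning step. This does not call the Task tool; it only decides what would be dispatched. `should_spawn_subagents` and `plan_subagents` are hypothetical names. Giving every subagent the full file list is an assumption based on the "review only its assigned category" rule. The skill does not say what happens below the threshold, so the sketch returns an empty plan in that case.

```python
# Category keys as they appear in the JSON report's "category" field.
SUBAGENT_CATEGORIES = ("tests", "dead_code", "abstraction", "style")
PARALLEL_THRESHOLD = 4


def should_spawn_subagents(file_count, parallel_flag=False):
    """Spawn when --parallel is set or at least four files are in scope."""
    return parallel_flag or file_count >= PARALLEL_THRESHOLD


def plan_subagents(files, parallel_flag=False):
    """Return one (category, files) assignment per subagent, or [] otherwise."""
    if not should_spawn_subagents(len(files), parallel_flag):
        return []
    # Assumption: each subagent sees every file but reports one category only.
    return [(category, list(files)) for category in SUBAGENT_CATEGORIES]
```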
Step 4: Consolidate Findings
Wait for all subagents to complete, then:
- Merge all findings into a single list
- Assign unique IDs (1, 2, 3...)
- Group by category for display
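The three consolidation steps can be sketched as follows, assuming each subagent returns a list of finding dictionaries that already carry a `category` key. `consolidate` is a hypothetical helper name.

```python
from collections import defaultdict


def consolidate(per_agent_findings):
    """Merge findings, assign sequential IDs, and group them by category."""
    # Copy each finding so the subagents' original results are not mutated.
    merged = [dict(finding) for findings in per_agent_findings for finding in findings]
    for finding_id, finding in enumerate(merged, start=1):
        finding["id"] = finding_id
    grouped = defaultdict(list)
    for finding in merged:
        grouped[finding["category"]].append(finding)
    return merged, dict(grouped)
```

IDs follow the order in which subagent results are passed in, so passing them in a fixed category order keeps the numbering stable between runs.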
Step 5: Write JSON Report
Create .beagle directory if it doesn't exist:
mkdir -p .beagle
Write findings to .beagle/llm-artifacts-review.json:
{
"version": "1.0.0",
"created_at": "2024-01-15T10:30:00Z",
"git_head": "abc1234",
"scope": "changed" | "all",
"files_scanned": 42,
"languages": ["Python", "TypeScript", "Go"],
"findings": [
{
"id": 1,
"category": "tests" | "dead_code" | "abstraction" | "style",
"type": "dry_violation" | "unused_import" | "over_abstraction" | "verbose_comment" | ...,
"file": "src/utils/helper.py",
"line": 42,
"description": "Repeated setup code in 5 test functions",
"suggestion": "Extract to a pytest fixture",
"risk": "Low" | "Medium" | "High",
"fix_safety": "Safe" | "Needs review",
"fix_action": "refactor" | "delete" | "simplify" | "extract"
}
],
"summary": {
"total": 15,
"by_category": {
"tests": 4,
"dead_code": 5,
"abstraction": 3,
"style": 3
},
"by_risk": {
"High": 2,
"Medium": 8,
"Low": 5
},
"by_fix_safety": {
"Safe": 10,
"Needs review": 5
}
}
}
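The `summary` block can be derived from the findings rather than maintained by hand. The sketch below assumes every finding has `category`, `risk`, and `fix_safety` keys as in the schema above. `summarize` and `build_report` are hypothetical helpers, and the caller is assumed to supply `git_head`.

```python
from collections import Counter
from datetime import datetime, timezone


def summarize(findings):
    """Count findings overall and per category, risk, and fix safety."""
    return {
        "total": len(findings),
        "by_category": dict(Counter(f["category"] for f in findings)),
        "by_risk": dict(Counter(f["risk"] for f in findings)),
        "by_fix_safety": dict(Counter(f["fix_safety"] for f in findings)),
    }


def build_report(findings, files_scanned, languages, scope, git_head):
    """Assemble the report dictionary in the shape shown above."""
    return {
        "version": "1.0.0",
        "created_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "git_head": git_head,
        "scope": scope,  # "changed" or "all"
        "files_scanned": files_scanned,
        "languages": languages,
        "findings": findings,
        "summary": summarize(findings),
    }
```

Counting from the findings list keeps `summary.total` and the per-key counts consistent with the `findings` array.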
Step 6: Display Summary
## LLM Artifacts Review
**Scope:** Changed files from main | Entire codebase
**Files scanned:** 42
**Languages:** Python, TypeScript, Go
### Findings by Category
#### Tests (4 issues)
1. [src/tests/test_api.py:15] **DRY violation** (Medium, Safe)
- Repeated setup code in 5 test functions
- Suggestion: Extract to a pytest fixture
2. [src/tests/test_utils.py:42] **Wrong mock boundary** (High, Needs review)
- Mocking internal implementation details
- Suggestion: Mock at the adapter boundary instead
#### Dead Code (5 issues)
3. [src/utils/legacy.py:1] **Unused module** (Low, Safe)
- Module imported nowhere in codebase
- Suggestion: Delete file
...
#### Abstraction (3 issues)
...
#### Style (3 issues)
...
### Summary Table
| Category | Safe Fixes | Needs Review | Total |
|----------|------------|--------------|-------|
| Tests | 3 | 1 | 4 |
| Dead Code | 4 | 1 | 5 |
| Abstraction | 2 | 1 | 3 |
| Style | 1 | 2 | 3 |
| **Total** | **10** | **5** | **15** |
### Next Steps
- Run `/beagle-core:review-llm-artifacts --fix` to auto-fix Safe issues (coming soon)
- Review the JSON report at `.beagle/llm-artifacts-review.json`
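The Summary Table in the template above can be generated from the findings. This is a sketch under the assumption that `fix_safety` is either "Safe" or "Needs review". `CATEGORY_LABELS` and `summary_table` are hypothetical names.

```python
# Display labels for the category keys used in the JSON report.
CATEGORY_LABELS = {
    "tests": "Tests",
    "dead_code": "Dead Code",
    "abstraction": "Abstraction",
    "style": "Style",
}


def summary_table(findings):
    """Render the per-category Safe / Needs Review / Total table as markdown."""
    lines = [
        "| Category | Safe Fixes | Needs Review | Total |",
        "|----------|------------|--------------|-------|",
    ]
    total_safe = total_review = 0
    for key, label in CATEGORY_LABELS.items():
        in_category = [f for f in findings if f["category"] == key]
        safe = sum(1 for f in in_category if f["fix_safety"] == "Safe")
        review = len(in_category) - safe
        lines.append(f"| {label} | {safe} | {review} | {len(in_category)} |")
        total_safe += safe
        total_review += review
    total = total_safe + total_review
    lines.append(f"| **Total** | **{total_safe}** | **{total_review}** | **{total}** |")
    return "\n".join(lines)
```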
Step 7: Verification
Before completing, verify the review executed correctly:
- JSON validity: Confirm .beagle/llm-artifacts-review.json exists and is parseable
- Subagent success: All 4 subagents completed without errors
- Git HEAD captured: The git_head field is non-empty in the report
- Staleness check: If a previous report exists, compare stored git_head to current HEAD and warn if different
# Verify JSON is valid
python3 -c "import json; json.load(open('.beagle/llm-artifacts-review.json'))" 2>/dev/null && echo "✓ Valid JSON" || echo "✗ Invalid JSON"
# Check for staleness (if previous report exists)
STORED_HEAD=$(jq -r '.git_head' .beagle/llm-artifacts-review.json 2>/dev/null)
CURRENT_HEAD=$(git rev-parse --short HEAD)
if [ -n "$STORED_HEAD" ] && [ "$STORED_HEAD" != "null" ] && [ "$STORED_HEAD" != "$CURRENT_HEAD" ]; then
echo "⚠️ Report was generated on $STORED_HEAD, current HEAD is $CURRENT_HEAD"
fi
If any verification fails, report the error and do not proceed.
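The file-based checks can also be expressed as a single function that separates hard failures from the staleness warning. This is a sketch, not part of the skill: `verify_report` is a hypothetical helper, and the caller is assumed to pass in the current short HEAD. Subagent success cannot be read from the report file, so it is not covered here.

```python
import json
from pathlib import Path


def verify_report(report_path, current_head):
    """Return (errors, warnings); errors should stop the run, warnings should not."""
    path = Path(report_path)
    if not path.is_file():
        return [f"{report_path} does not exist"], []
    try:
        report = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{report_path} is not valid JSON: {exc}"], []

    errors, warnings = [], []
    git_head = report.get("git_head") if isinstance(report, dict) else None
    if not git_head:
        errors.append("git_head is missing or empty")
    elif git_head != current_head:
        # A stale report is only a warning, matching the staleness check above.
        warnings.append(
            f"Report was generated on {git_head}, current HEAD is {current_head}"
        )
    return errors, warnings
```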
Output Format for Each Finding
[FILE:LINE] **ISSUE_TYPE** (Risk, Fix Safety)
- Description
- Suggestion: Specific fix recommendation
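The format above maps directly onto the fields of a finding in the JSON report. In the sketch below, `format_finding` is a hypothetical helper. The display label is passed in separately because the skill does not define how a `type` value such as `dry_violation` becomes a label such as "DRY violation".

```python
def format_finding(finding, label):
    """Render one finding in the three-line output format."""
    header = (
        f"[{finding['file']}:{finding['line']}] **{label}** "
        f"({finding['risk']}, {finding['fix_safety']})"
    )
    return "\n".join(
        [
            header,
            f"- {finding['description']}",
            f"- Suggestion: {finding['suggestion']}",
        ]
    )
```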
Rules
- Always load the beagle-core:llm-artifacts-detection skill first
- Use Task tool for parallel subagents when >= 4 files
- Every finding MUST have file:line reference
- Categorize risk honestly (don't inflate or deflate)
- Mark fix safety as "Safe" only if change is mechanical and reversible
- Create .beagle directory if needed
- Write JSON report before displaying summary