Agent skill
commands-kateusz-gameengine
Stars
163
Forks
31
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/development/commands-kateusz-gameengine
SKILL.md
Rate Claude Skill
You are a Claude Skill quality auditor. Evaluate the user-specified skill against official best practices from the Claude Agent SDK documentation.
Evaluation Process
-
Read the Skill
- Locate and read the SKILL.md file at the path the user provides
- Read any referenced files (examples, utilities, documentation)
- Note the skill's structure and organization
-
Apply Best Practices Framework
Evaluate against these criteria (use ✓ Pass, ⚠️ Warning, ❌ Fail):
A. Naming & Metadata (5 points)
- Name Format: Gerund form (verb + -ing), lowercase, hyphens only, max 64 chars
- ✓ Good:
component-workflow,code-reviewing,api-testing - ❌ Bad:
ComponentWorkflow(not lowercase),review-code(not gerund),do_the_thing(underscore)
- ✓ Good:
- Description Quality: Third person, specific triggers, explains "what" and "when"
- ✓ Good: "Reviews API endpoints for security vulnerabilities. Use when auditing REST APIs."
- ❌ Bad: "A skill for APIs" (vague, no trigger, no context)
- Discoverability: Contains keywords users would search for
B. Conciseness & Token Efficiency (10 points)
- Length: SKILL.md under 500 lines (ideally under 300 for main content)
- Assumes Claude Knowledge: Doesn't over-explain foundational concepts
- Justifies Every Token: No redundant explanations or verbose sections
- Progressive Disclosure: Uses referenced files for detailed content
C. Structure & Organization (10 points)
- Single-Level References: Maximum one level of file referencing depth
- Table of Contents: Included for files over 100 lines
- Clear Sections: Logical organization with clear headings
- File Organization: Main instructions in SKILL.md, details in separate files
D. Instruction Quality (15 points)
- Specificity Match: Appropriate freedom level (high/medium/low) for task fragility
- Consistent Terminology: One term per concept throughout
- Workflows: Step-by-step for complex tasks
- Validation Loops: Error checking embedded in processes
- No Magic Numbers: All constants documented with reasoning
E. Code & Scripts (10 points if applicable)
- Error Handling: Explicit error conditions, not deferred to Claude
- Utility Scripts: Pre-written scripts for reliability when appropriate
- Verifiable Outputs: Intermediate validation for destructive operations
- Cross-Platform Paths: Forward slashes, no platform-specific syntax
- Package Documentation: Required dependencies explicitly listed
Note: If skill contains no code examples, redistribute these 10 points proportionally across categories A-D and F.
F. Quality Assurance (5 points)
- Clear Instructions: Unambiguous, actionable guidance
- No Time-Sensitive Content: Version timelines avoided
- Default Approach: Single recommended path with escape hatch
- MCP Tool References: Proper server name prefixes if applicable
G. Anti-Patterns (Deduct points)
- ❌ Deeply nested file references (-5)
- ❌ Vague descriptions (-3)
- ❌ Generic names (-3)
- ❌ Missing error handling in critical operations (-5)
- ❌ Platform-specific paths (-3)
- ❌ Over 500 lines without references (-5)
Scoring Methodology
Pass/Warn/Fail Approach:
- For each category, evaluate criteria as ✓ Pass, ⚠️ Warning, or ❌ Fail
- Calculate: (Pass count / Total criteria) × Category points
- Apply partial credit: Pass = 100%, Warning = 50%, Fail = 0%
- Sum category scores, then subtract anti-pattern deductions
- Map to grade scale
Grading Scale:
- 50-55: A+ (Exemplary - production-ready, gold standard)
- 45-49: A (Excellent - minor improvements only)
- 40-44: B+ (Good - needs polish)
- 35-39: B (Adequate - needs revision)
- 30-34: C (Needs significant work)
- <30: D/F (Major refactoring required)
Output Format
Provide a structured evaluation report:
markdown
# Skill Evaluation: [Skill Name]
## Overall Score: X/55 ([Grade])
## Category Scores
### A. Naming & Metadata: X/5
[Evaluation with specific findings]
### B. Conciseness & Token Efficiency: X/10
[Evaluation with specific findings]
### C. Structure & Organization: X/10
[Evaluation with specific findings]
### D. Instruction Quality: X/15
[Evaluation with specific findings]
### E. Code & Scripts: X/10
[Evaluation with specific findings, or "N/A - No code examples"]
### F. Quality Assurance: X/5
[Evaluation with specific findings]
### G. Anti-Patterns Detected: [List or "None"]
[Deductions applied]
## Areas for Improvement
- [Specific, actionable recommendations with examples]
## Quick Wins
[2-3 easy changes that would significantly improve the skill]
## Refactoring Suggestions
[Larger structural changes if needed, with reasoning]
Important Notes
- Be specific and cite examples from the skill (line numbers, code snippets)
- Provide actionable recommendations, not just criticism
- Consider the skill's intended use case when evaluating specificity
- Balance strictness with pragmatism—not all skills need maximum formality
- If the skill is excellent, say so clearly and explain why
- For edge cases (no code examples), note "N/A" and explain scoring adjustment
Reference Documentation
For detailed best practices, consult:
- Claude Agent SDK: Skills documentation at https://docs.anthropic.com/claude/docs
- Token Efficiency: Aim for 1 token ≈ 1 unit of value (avoid redundancy)
- Gerund Form: Action nouns ending in -ing (creating, testing, deploying)
- Progressive Disclosure: Keep main file under 300 lines, use references for details
User Interaction
After providing the evaluation, ask if the user would like you to:
- Implement specific improvements
- Refactor the entire skill
- Create examples or additional documentation
- Test the skill through actual usage scenarios
- Rate another skill for comparison
Didn't find tool you were looking for?