test-failure-mindset
Use when encountering failing tests, diagnosing test errors, or establishing a systematic approach to test failure investigation. Activates on "test failure analysis", "debugging tests", or "why tests fail" requests. Establishes the mindset that treats test failures as valuable diagnostic signals requiring root-cause investigation — not automatic code fixes or test dismissal.
Install this agent skill to your Project
npx add-skill https://github.com/Jamie-BitFlight/claude_skills/tree/main/plugins/development-harness/skills/test-failure-mindset
Test Failure Analysis Mindset
Establish a balanced investigative approach for all test failures encountered in this session.
Core Principle
Tests are specifications: they define expected behavior. A failing test calls for balanced investigation, not automatic dismissal of the test and not an automatic change to the code.
Dual Hypothesis Approach
Always consider both possibilities when a test fails:
| Hypothesis A | Hypothesis B |
|---|---|
| Test expectations are incorrect | Implementation has a bug |
| Test is outdated | Test caught a regression |
| Test has wrong assumptions | Test found an edge case |
Investigation Protocol
For EVERY test failure:
1. Pause and Read
- Understand what the test is trying to verify
- Read its name, comments, and assertions carefully
- Check the test's history (git blame) for context
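The history check in step 1 could be scripted roughly as follows. This is a sketch, assuming `git` is on the PATH and the script runs inside the repository; the helper names `history_commands` and `run_history`, and the example path, are hypothetical.

```python
import subprocess
from typing import List


def history_commands(test_path: str, start: int, end: int) -> List[List[str]]:
    """Build git commands that show who last changed a test and why."""
    return [
        # Who last touched each line in the given range, and in which commit.
        ["git", "blame", "-L", f"{start},{end}", "--", test_path],
        # Commit messages and diffs for the file, following renames.
        ["git", "log", "--follow", "-p", "--", test_path],
    ]


def run_history(test_path: str, start: int, end: int) -> None:
    """Run each command in the current repository and print its output."""
    for cmd in history_commands(test_path, start, end):
        result = subprocess.run(cmd, capture_output=True, text=True, check=False)
        print(result.stdout or result.stderr)
```

For example, `run_history("tests/test_user.py", 10, 25)` would print the blame for lines 10 to 25 of that (assumed) file, followed by its commit history.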
2. Trace the Implementation
- Follow the code path that leads to the failure
- Understand actual behavior vs. expected behavior
- Check if recent changes affected this code path
3. Consider the Context
- Is this testing a documented requirement?
- Would current behavior surprise a user?
- What would be the impact of each possible fix?
4. Make a Reasoned Decision
| Situation | Action |
|---|---|
| Implementation is wrong | Fix the bug |
| Test is wrong | Fix test AND document why |
| Unclear | Seek clarification before changing code or tests |
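The "Fix test AND document why" row might look like the sketch below. The function `page_count`, the test, and the requirement cited in the comment are all hypothetical, chosen only to show where the explanation lives.

```python
def page_count(total_items: int, page_size: int) -> int:
    """Hypothetical implementation: pages needed to show all items."""
    return -(-total_items // page_size)  # ceiling division


def test_page_count_rounds_up_partial_pages() -> None:
    # Originally asserted page_count(11, 5) == 2.
    # Investigation: the (assumed) requirement says a partial page still
    # counts as a page, so 11 items at 5 per page need 3 pages.
    # The implementation was right and the old expectation truncated,
    # so the expectation was corrected rather than the code.
    assert page_count(11, 5) == 3
    assert page_count(10, 5) == 2  # exact multiple: no extra page
```

Keeping the reasoning next to the assertion means a later reader can see that the change followed an investigation, not a shortcut to make the suite pass.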
5. Learn from the Failure
- What can this teach about the system?
- Should additional tests cover related cases?
- Is there a pattern being missed?
Red Flags (Dangerous Patterns)
- Immediately changing tests to match implementation
- Assuming implementation is always correct
- Bulk-updating tests without individual analysis
- Removing "inconvenient" test cases
- Adding mock/stub workarounds instead of fixing root causes
Good Practices
- Treat each test failure as a potential bug discovery
- Document analysis in comments when fixing tests
- Write clear test names that explain intent
- When changing a test, explain why the original was wrong
- Consider adding more tests when finding ambiguity
Example Responses
Good: "I see test_user_validation is failing. Let me trace through the validation logic to determine whether this is catching a real bug or whether the test's expectations are incorrect."
Bad: "The test is failing so I'll update it to match what the code does."
Remember
Every test failure is an opportunity to:
- Discover and fix a bug before users do
- Clarify ambiguous requirements
- Improve system understanding
- Strengthen the test suite
The goal is NOT to make tests pass quickly. The goal IS to ensure the system behaves correctly.
Related Skills
- analyze-test-failures: Detailed analysis of specific test failures
- comprehensive-test-review: Full test suite review