plugin-dev-workflow

Guide plugin development workflow — editing skills, agents, hooks, or eval framework in this repo. Use when modifying files in plugins/elixir-phoenix/, lab/eval/, or lab/autoresearch/. Ensures changes pass eval, lint, and tests before committing.


Install this agent skill in your project:

npx add-skill https://github.com/oliver-kriska/claude-elixir-phoenix/tree/main/.claude/skills/plugin-dev-workflow

SKILL.md

Plugin Development Workflow

This repo is the Elixir/Phoenix Claude Code plugin. When editing plugin files, follow this workflow to ensure quality.

Before You Start

Run make help to see all available commands:

bash

make eval          # Quick: lint + score changed skills/agents
make eval-all      # Full: all 40 skills + 20 agents
make eval-fix      # Auto-fix + show failures
make test          # 52 pytest tests for eval framework
make ci            # Full CI pipeline

Scoring Individual Files (CLI)

IMPORTANT: Always use -m module syntax, never run scorer.py directly.

bash

# Score ONE skill (use -m, NOT direct file path)
python3 -m lab.eval.scorer plugins/elixir-phoenix/skills/verify/SKILL.md

# Score ONE skill with pretty output
python3 -m lab.eval.scorer plugins/elixir-phoenix/skills/verify/SKILL.md --pretty

# Score all skills
python3 -m lab.eval.scorer --all

# Score ONE agent
python3 -m lab.eval.agent_scorer plugins/elixir-phoenix/agents/verification-runner.md

# Score all agents
python3 -m lab.eval.agent_scorer --all

When Editing Skills (plugins/elixir-phoenix/skills/*/SKILL.md)

Read CLAUDE.md conventions (size limits, frontmatter requirements)
Make your changes
Run make eval — it auto-detects changed skills and scores them
If a skill fails, check which dimension failed and fix it
Run make lint to verify markdown formatting
Commit

Skill requirements (eval checks all of these):

Frontmatter: name, description, effort. The description must start with an action verb and include "Use when..."
Iron Laws section with 1+ numbered items
Under 185 lines (command skills) or 150 lines (reference skills)
No section exceeds 45 lines
All /phx: references point to existing skills
All references/*.md paths exist
No dangerous code patterns outside Iron Laws sections
Code examples present (1+ fenced code blocks)
"Use when..." in description (for trigger accuracy)
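A few of the requirements above can be sketched as a small standalone check. This is only an illustration, not the repo's scorer: `check_skill`, `parse_frontmatter`, and the `kind` argument are hypothetical names, the frontmatter is assumed to be simple `key: value` lines between two `---` markers, and only the frontmatter, line-count, Iron Laws, and code-block rules are covered.

```python
import re

# Limits copied from the requirements list; everything else is an assumption.
MAX_LINES = {"command": 185, "reference": 150}
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def parse_frontmatter(lines):
    """Collect 'key: value' pairs between the leading pair of '---' lines."""
    fields = {}
    if not lines or lines[0].strip() != "---":
        return fields
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_skill(text, kind="command"):
    """Return a list of problems; an empty list means every check passed."""
    lines = text.splitlines()
    fields = parse_frontmatter(lines)
    problems = []

    for name in ("name", "description", "effort"):
        if not fields.get(name):
            problems.append(f"missing frontmatter field: {name}")
    if "Use when" not in fields.get("description", ""):
        problems.append("description lacks 'Use when'")
    if len(lines) >= MAX_LINES[kind]:
        problems.append(f"{len(lines)} lines, limit is under {MAX_LINES[kind]}")

    # Look for a numbered item under a heading that mentions "Iron Laws".
    # Lines inside fenced blocks are not treated as headings.
    in_fence = False
    in_iron_laws = False
    has_numbered_law = False
    for line in lines:
        if line.startswith(FENCE):
            in_fence = not in_fence
        elif not in_fence and line.startswith("#"):
            in_iron_laws = "Iron Laws" in line
        elif in_iron_laws and re.match(r"\d+\.\s", line):
            has_numbered_law = True
    if not has_numbered_law:
        problems.append("no Iron Laws section with a numbered item")

    # One fenced block needs an opening and a closing fence line.
    if sum(line.startswith(FENCE) for line in lines) < 2:
        problems.append("no fenced code block")
    return problems
```

Calling `check_skill(open(path).read(), kind="reference")` on a skill file would return the list of violated rules under these assumptions; `make eval` remains the authoritative check.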

When Editing Agents (plugins/elixir-phoenix/agents/*.md)

Make your changes
Run make eval-agents to score all agents
Agent requirements:
- permissionMode: bypassPermissions (always — background agents need it)
- disallowedTools: Write, Edit, NotebookEdit for review/analysis agents
- model matches effort: haiku=low, sonnet=medium, opus=high
- Under 300 lines (specialist) or 535 lines (orchestrator)
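The agent requirements above can be illustrated with a minimal sketch. It is not the repo's `agent_scorer`: `check_agent` is a hypothetical helper, the frontmatter is assumed to be already parsed into a dict, `effort` is assumed to be a frontmatter field, and `disallowedTools` is assumed to be a comma-separated string.

```python
# Values copied from the agent requirements; the function itself is a sketch.
MODEL_FOR_EFFORT = {"low": "haiku", "medium": "sonnet", "high": "opus"}
MAX_AGENT_LINES = {"specialist": 300, "orchestrator": 535}
READ_ONLY_DISALLOWED = {"Write", "Edit", "NotebookEdit"}


def check_agent(frontmatter, line_count, kind="specialist", read_only=False):
    """Return a list of problems for one agent file; empty means it passes."""
    problems = []

    if frontmatter.get("permissionMode") != "bypassPermissions":
        problems.append("permissionMode must be bypassPermissions")

    expected = MODEL_FOR_EFFORT.get(frontmatter.get("effort"))
    if expected is None:
        problems.append("effort must be low, medium, or high")
    elif frontmatter.get("model") != expected:
        problems.append(
            f"model should be {expected} for effort {frontmatter['effort']}"
        )

    # Review/analysis agents must not be able to modify files.
    if read_only:
        raw = frontmatter.get("disallowedTools", "")
        disallowed = {tool.strip() for tool in raw.split(",")}
        missing = READ_ONLY_DISALLOWED - disallowed
        if missing:
            problems.append("disallowedTools missing: " + ", ".join(sorted(missing)))

    if line_count >= MAX_AGENT_LINES[kind]:
        problems.append(f"{line_count} lines, limit is under {MAX_AGENT_LINES[kind]}")
    return problems
```

Passing `read_only=True` models the review/analysis case; other agents skip the `disallowedTools` check.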

When Editing Eval Framework (lab/eval/*.py)

Make your changes
Run make test — 52 pytest tests must pass
Run make eval-all — verify no skills/agents regressed
If adding new matchers: add tests in lab/eval/tests/test_matchers.py

When Editing Hooks (plugins/elixir-phoenix/hooks/scripts/*.sh)

Make your changes
Run make lint to check the markdown in hook comments
Test the hook manually (hooks run on Edit/Write/Bash events)
Check that the hook documentation in CLAUDE.md is still accurate

Autoresearch (Self-Improvement Loop)

If make eval-fix shows failures, it suggests an autoresearch command:

bash

# Copy-paste the suggested command from eval-fix output
claude -p 'Run autoresearch. Score all skills...' --allowedTools 'Edit,Read,Write,Bash,Glob,Grep'

This runs the autoresearch loop: find weakest skill → fix ONE issue → re-score → keep/revert.
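The control flow of that loop can be sketched in a few lines. This is a simplified model, not the code in lab/autoresearch/: `autoresearch`, `propose_fix`, and `rescore` are hypothetical names, and the `undo` callable stands in for whatever revert mechanism the real loop uses.

```python
def autoresearch(scores, propose_fix, rescore, max_iterations=10):
    """Greedy improvement loop over a dict mapping skill name to score.

    propose_fix(skill) applies ONE change and returns a callable that undoes it.
    rescore(skill) returns the skill's score after the change.
    """
    history = []
    for _ in range(max_iterations):
        # Find the weakest skill.
        skill = min(scores, key=scores.get)
        before = scores[skill]

        # Fix one issue, then re-score.
        undo = propose_fix(skill)
        after = rescore(skill)

        # Keep the change only if the score improved; otherwise revert it.
        if after > before:
            scores[skill] = after
            history.append((skill, "keep"))
        else:
            undo()
            history.append((skill, "revert"))
    return history
```

Because each iteration changes exactly one thing, a drop in score can always be attributed to that change and reverted cleanly.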

Pre-Commit Checklist

Before committing any plugin changes:

make lint passes
make eval passes (changed files)
make test passes (if eval framework changed)
CHANGELOG.md updated (if user-visible change)
Version bumped in plugin.json (if releasing)
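The checklist can be turned into a small planning helper, sketched below. `precommit_plan` and its `user_visible` and `releasing` flags are hypothetical; the helper only lists which commands to run and what to remember, and does not run anything itself.

```python
def precommit_plan(changed_paths, user_visible=False, releasing=False):
    """Return (commands, reminders) for a set of changed file paths."""
    changed = set(changed_paths)

    # Lint and eval apply to every plugin change.
    commands = ["make lint", "make eval"]

    # Tests are only required when the eval framework itself changed.
    if any(path.startswith("lab/eval/") for path in changed):
        commands.append("make test")

    reminders = []
    if user_visible and "CHANGELOG.md" not in changed:
        reminders.append("update CHANGELOG.md")
    if releasing and not any(path.endswith("plugin.json") for path in changed):
        reminders.append("bump version in plugin.json")
    return commands, reminders
```

The list of changed paths could come from `git diff --name-only`, for example.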

References

CLAUDE.md — full conventions, size limits, checklist
lab/eval/ — scoring framework (24 matchers, 8 dimensions)
lab/autoresearch/ — self-improvement loop
lab/findings/interesting.jsonl — log interesting discoveries here

Maintainer

oliver-kriska Core maintainer

Source details

Full Name: oliver-kriska/claude-elixir-phoenix
Branch: main
Path in repo: .claude/skills/plugin-dev-workflow
License: MIT License
Topics: claude-code claude claude-code-skills automation claude-skills vibe-coding claude-code-plugin elixir elixir-phoenix phoenix elixir-lang phoenix-framework
