weaver-regression-test
Run the full Weaver provisioner regression test suite against the production service. Covers full_ft, LoRA, debug modes, and edge cases.
Install this agent skill to your Project
npx add-skill https://github.com/nex-agi/weaver/tree/main/.claude/skills/weaver-regression-test
Weaver Provisioner Regression Testing Skill
Overview
Run end-to-end regression tests for the Weaver provisioner following the test plan in .claude/skills/weaver-regression-test/regression_test_plan.md. Tests cover full fine-tuning, LoRA, debug modes (auto/manual), and edge cases.
Prerequisites
conda activate weaver
export WEAVER_API_KEY="<your-key>"
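A quick preflight check can catch a missing key or an inactive environment before any test starts. The following is a minimal sketch, not part of the skill itself: the function names are hypothetical, and it assumes that `conda activate` exposes the active environment name through the `CONDA_DEFAULT_ENV` variable.

```python
import os

REQUIRED_ENV_VARS = ["WEAVER_API_KEY"]
EXPECTED_CONDA_ENV = "weaver"


def preflight_problems(env):
    """Return human-readable problems found in the given environment mapping."""
    problems = []
    for name in REQUIRED_ENV_VARS:
        # Treat an empty value the same as an unset variable.
        if not env.get(name):
            problems.append(f"{name} is not set")
    # Assumption: conda records the active environment in CONDA_DEFAULT_ENV.
    if env.get("CONDA_DEFAULT_ENV") != EXPECTED_CONDA_ENV:
        problems.append(f"conda environment '{EXPECTED_CONDA_ENV}' is not active")
    return problems


def check_current_environment():
    """Run the preflight check against the real process environment."""
    return preflight_problems(os.environ)
```

An empty list from `check_current_environment()` means both prerequisites appear to be satisfied.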
Instructions
- Follow regression_test_plan.md strictly (co-located in this skill directory). Execute every test case as documented.
- Use the production service at https://weaver-console.nex-agi.cn; do not build or run locally.
- IAM errors can be retried without recording. If an IAM authentication error occurs, simply retry the request.
- Use shared knowledge skills for log analysis. For each test, use the skills from china-qijizhifeng/nex-taas-shared-knowledge to:
  - Query Infrawaves for pod status (by model_id)
  - Query Volcengine TLS for provisioner logs
  - Cross-reference with china-qijizhifeng/weaver-server source code for root cause analysis
- Run 3 rounds of testing. Create a GitHub issue in china-qijizhifeng/weaver-server and post each round's results as a comment:
  - Pass: Record as passed
  - Fail: Analyze root cause using log skills, record the analysis, and create a sub-issue in china-qijizhifeng/weaver-server linked to the main test issue
  - Flaky (passes on retry): Still record the failure, including log analysis from the failed attempt
  - Blocking failure: Skip the test, move on to the next one, but do not continue multi-round iteration for that test
  - Between rounds: Verify all tasks have stopped, then wait 10 minutes before starting the next round
- Post a final summary report to the issue after all rounds complete.
- Use example scripts from the examples/ directory:
  - examples/pig_latin_fullft.py: Full fine-tuning (F1-F4)
  - examples/pig_latin_lora.py: LoRA training (L1-L4)
  - examples/pig_latin_fullft_debug_auto.py: Debug auto mode (DA1-DA4)
  - examples/pig_latin_fullft_debug_manual.py: Debug manual mode (DM1-DM3)
  - examples/pig_latin_lora_alt.py: LoRA with alternate base model (L4)
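The round policy above (IAM retries, flaky detection, skipping blocked tests, and the 10-minute gap) could be sketched roughly as follows. Everything here is illustrative: `run_test`, `all_tasks_stopped`, `IAMAuthError`, the outcome strings, the single retry after a failure, the IAM retry cap, and the 30-second poll interval are assumptions, not part of the skill or the Weaver SDK.

```python
import time

ROUNDS = 3
WAIT_BETWEEN_ROUNDS_S = 10 * 60  # "wait 10 minutes" between rounds
POLL_INTERVAL_S = 30             # assumed interval for re-checking task status


class IAMAuthError(Exception):
    """Hypothetical marker for an IAM authentication failure."""


def run_with_iam_retry(run_test, test_id, max_iam_retries=3):
    """Call run_test, silently retrying IAM errors (the retry cap is an assumption)."""
    for attempt in range(max_iam_retries + 1):
        try:
            return run_test(test_id)
        except IAMAuthError:
            if attempt == max_iam_retries:
                raise


def classify(run_test, test_id):
    """Return PASS, FLAKY, FAIL, or BLOCKED for one test in one round."""
    first = run_with_iam_retry(run_test, test_id)
    if first in ("PASS", "BLOCKED"):
        return first
    # A failure followed by a passing retry is still recorded, as FLAKY.
    second = run_with_iam_retry(run_test, test_id)
    return "FLAKY" if second == "PASS" else "FAIL"


def run_rounds(test_ids, run_test, all_tasks_stopped, sleep=time.sleep):
    """Run all rounds and return one {test_id: outcome} dict per round."""
    blocked = set()
    reports = []
    for round_no in range(1, ROUNDS + 1):
        results = {}
        for test_id in test_ids:
            if test_id in blocked:
                # Blocking failures are not iterated again in later rounds.
                results[test_id] = "SKIPPED"
                continue
            outcome = classify(run_test, test_id)
            if outcome == "BLOCKED":
                blocked.add(test_id)
            results[test_id] = outcome
        reports.append(results)
        if round_no < ROUNDS:
            while not all_tasks_stopped():
                sleep(POLL_INTERVAL_S)
            sleep(WAIT_BETWEEN_ROUNDS_S)
    return reports
```

Passing `sleep` as a parameter keeps the loop easy to dry-run without actually waiting. Root cause analysis and issue posting are deliberately left out of the sketch.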
Test Groups
| Group | Tests | What it validates |
|---|---|---|
| Full FT Basic | F1, F2 | Single full_ft provision, independent pods, auto-termination |
| Full FT Concurrent | F3, F4 | 2-3 concurrent full_ft with full isolation |
| LoRA | L1-L4 | LoRA provision, shared trainer dedup, different base_model independence |
| Debug Auto | DA1-DA4 | Debug auto provision, pod reuse, crash recovery, concurrent dedup |
| Debug Manual | DM1-DM3 | Lazy provision on forward_backward, sleep infinity, manual torchrun |
| Edge Cases | E1-E4 | Pod crash recovery, invalid model errors, mixed mode concurrency |
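For scripting purposes, the ID ranges in the table can be expanded into a flat list of test IDs. This is a small sketch under the assumption that a range such as `DA1-DA4` always means consecutive numbers with a shared prefix; the names `TEST_GROUPS` and `expand_ids` are hypothetical.

```python
import re

# Group names and ID specs copied from the table above.
TEST_GROUPS = {
    "Full FT Basic": "F1, F2",
    "Full FT Concurrent": "F3, F4",
    "LoRA": "L1-L4",
    "Debug Auto": "DA1-DA4",
    "Debug Manual": "DM1-DM3",
    "Edge Cases": "E1-E4",
}


def expand_ids(spec):
    """Expand a spec like 'F1, F2' or 'DA1-DA4' into a list of test IDs."""
    ids = []
    for part in spec.split(","):
        part = part.strip()
        match = re.fullmatch(r"([A-Z]+)(\d+)-([A-Z]+)(\d+)", part)
        if match is None:
            ids.append(part)
            continue
        prefix, start, end_prefix, end = match.groups()
        if prefix != end_prefix:
            raise ValueError(f"mismatched prefixes in range: {part}")
        ids.extend(f"{prefix}{n}" for n in range(int(start), int(end) + 1))
    return ids


ALL_TEST_IDS = [tid for spec in TEST_GROUPS.values() for tid in expand_ids(spec)]
```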
Log Analysis
When a test fails, follow this troubleshooting order:
- Check script logs in /tmp/test_*.log
- Query pod status via Infrawaves skill (by model_id)
- Query provisioner logs via Volcengine TLS skill with keywords: provisioning, terminate, skip, debug, stale, lora dedup
- Cross-reference with weaver-server code:
  - Auto-provision: internal/services/instance_orchestrator.go
  - LoRA dedup: checkExistingLoRATrainer()
  - Debug mode: extractDebugMode(), provisionNewTrainer()
  - Terminate: HandleTerminate()
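Once log lines are available as plain text, whether from the local script logs or exported from a provisioner log query, the keyword triage listed above can be approximated with a simple filter. This is a sketch only: it does not call the Infrawaves or Volcengine TLS skills, and the helper names are hypothetical.

```python
import glob

# Keywords taken from the troubleshooting list above.
KEYWORDS = ["provisioning", "terminate", "skip", "debug", "stale", "lora dedup"]


def matching_lines(lines, keywords=KEYWORDS):
    """Return (line_number, matched_keywords, text) for lines containing any keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()  # case-insensitive match
        found = [kw for kw in keywords if kw in lowered]
        if found:
            hits.append((number, found, line.rstrip("\n")))
    return hits


def scan_script_logs(pattern="/tmp/test_*.log"):
    """Apply the keyword filter to every local script log matching the pattern."""
    results = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8", errors="replace") as handle:
            results[path] = matching_lines(handle)
    return results
```

The filter only narrows down where to look; root cause analysis still requires cross-referencing the weaver-server functions named above.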
GitHub Issue Format
# Weaver Provisioner Integration Test — Round N
| Test | Result | Notes |
|------|--------|-------|
| F1 | PASS | |
| F2 | PASS | |
| ... | ... | ... |
## Details
[Per-test details for any failures or notable observations]
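A per-round comment in this format could be generated from collected results along these lines. This is a hedged sketch: `render_round_report` and its default details text are hypothetical, and posting the comment to GitHub is out of scope here.

```python
def render_round_report(round_no, results, details=""):
    """Build the markdown comment body for one round.

    results maps test ID -> (result, notes), e.g. {"F1": ("PASS", "")}.
    """
    lines = [
        # \u2014 is the dash character used in the template title.
        f"# Weaver Provisioner Integration Test \u2014 Round {round_no}",
        "| Test | Result | Notes |",
        "|------|--------|-------|",
    ]
    for test_id, (result, notes) in results.items():
        lines.append(f"| {test_id} | {result} | {notes} |")
    lines.append("## Details")
    # Placeholder text when there is nothing to report (an assumption).
    lines.append(details or "No failures or notable observations.")
    return "\n".join(lines)
```

For example, `render_round_report(1, {"F1": ("PASS", ""), "F2": ("FAIL", "see sub-issue")})` yields a table with one row per test followed by the Details section.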