Agent skill

domain-advisor

Plan experiments or development tasks using past knowledge. Adapts behavior based on project domain (research, unsloth, cuda) by reading registry.json. Triggers on <advise> or questions like "What should we try next?"

Stars 163
Forks 31

Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/development/unknown-erland366-transformers-playgro

Metadata

Additional technical details for this skill

tags
planning experiments advice
short description
Plan next steps using past knowledge

SKILL.md

Skill: domain-advisor

When to use

Use this skill when:

  • The user message starts with <advise>, or
  • The user asks "What should we try next?" or similar planning questions.

Initialization

  1. Read .codex/skills/registry.json to determine:

    • domain: research | unsloth | cuda
    • paths.reports: where to find experiment/benchmark reports
    • paths.experiment_log: path to experiment log
    • paths.troubleshooting: path to troubleshooting guide
  2. Adapt behavior based on domain (see Domain-Specific Behavior below).

Inputs

  • User goal (including constraints like run budget, hardware, time).
  • Optional references to files (configs, reports, logs) mentioned in the goal.

Behavior

  1. Understand the goal

    • Parse the research/development goal and constraints.
    • If unclear, ask a concise clarifying question.
  2. Gather context

    • Scan .codex/skills/ for relevant result skills matching the task.
    • Read relevant reports from the paths.reports directory.
    • Skim recent entries in paths.experiment_log.
    • If errors are mentioned, check paths.troubleshooting for known patterns.
  3. Propose a plan

    • Design 2–5 concrete experiments/tasks, each with:
      • Clear objective
      • Key configuration/parameters
      • Expected outcome or hypothesis
      • Any relevant variation to test
    • Respect constraints (budget, hardware, time).
    • Reuse defaults from existing skills instead of inventing new ones.
  4. Output format

    • Start with a short natural-language summary.

    • Provide a markdown table:

      id description key_differences notes
    • Optionally propose file paths for configs or reports to create.

    • If errors were mentioned, explain how this plan avoids known failure patterns.

  5. Logging

    • When useful, append a short entry to paths.experiment_log summarizing the proposed plan (only with user approval).

Domain-Specific Behavior

Research Domain

When domain: research:

Focus areas:

  • Hyperparameter sweeps (learning rate, batch size, epochs)
  • Model architecture variations (layers, dimensions, attention heads)
  • Dataset mixtures and sampling strategies
  • Training dynamics (warmup, schedulers, checkpointing)

Context to gather:

  • training_reports/*.md - past training runs
  • Model configs and their performance
  • Loss curves and convergence patterns

Output emphasis:

  • Parameter sweep tables with specific values
  • Ablation study designs
  • Baseline comparisons

Unsloth Domain

When domain: unsloth:

Focus areas:

  • LoRA rank and alpha selection
  • Quantization settings (4-bit, 8-bit, nf4)
  • Gradient checkpointing configuration
  • Fine-tuning hyperparameters for specific model families
  • Memory optimization strategies

Context to gather:

  • training_reports/*.md - past fine-tuning runs
  • Model-specific Unsloth configurations
  • Memory usage patterns

Output emphasis:

  • LoRA configuration recommendations
  • Memory/speed tradeoffs
  • Model-specific settings (Llama, Mistral, Qwen, etc.)

CUDA Domain

When domain: cuda:

Focus areas:

  • Tiling strategies and block sizes
  • Shared memory usage patterns
  • Warp-level primitives
  • Memory coalescing optimization
  • Triton autotuning configurations

Context to gather:

  • benchmark_results/*.md - past kernel benchmarks
  • Profiling data (nsight, ncu reports)
  • Bandwidth and occupancy metrics

Output emphasis:

  • Kernel configuration parameters
  • Expected speedup estimates
  • Memory access pattern recommendations
  • Profiling metrics to track

Example Output

Research Example

markdown
## Experiment Plan: Attention Head Ablation

Based on previous runs in `training_reports/baseline-2025-01.md`, the 8-head
configuration achieved 92% accuracy. Testing whether fewer heads can match
this with lower compute.

| id | description | key_differences | notes |
|----|-------------|-----------------|-------|
| A1 | 4-head attention | heads=4 vs baseline 8 | Test if 4 heads sufficient |
| A2 | 6-head attention | heads=6 | Middle ground |
| A3 | 4-head + wider FFN | heads=4, ffn_dim=4096 | Compensate with FFN |

All runs use: lr=1e-4, batch_size=32, epochs=10 (from baseline config).

Unsloth Example

markdown
## Fine-tuning Plan: Llama-3 8B with Unsloth

Based on `training_reports/llama3-lora-v1.md`, rank=16 showed good results
but OOM'd at batch_size=4. Testing memory-efficient configurations.

| id | description | key_differences | notes |
|----|-------------|-----------------|-------|
| U1 | rank=8 + grad_ckpt | Lower rank, enable checkpointing | Memory baseline |
| U2 | rank=16 + 4bit | Full rank with 4-bit quantization | Quality vs memory |
| U3 | rank=32 + offload | Higher rank with CPU offload | Max quality attempt |

All runs use: alpha=32, dropout=0.05, target_modules=["q_proj", "v_proj"]

CUDA Example

markdown
## Kernel Optimization Plan: Softmax

Based on `benchmark_results/softmax-v1.md`, current implementation achieves
80% of theoretical bandwidth. Testing tiling strategies.

| id | description | key_differences | notes |
|----|-------------|-----------------|-------|
| K1 | 2D tiling | BLOCK_M=64, BLOCK_N=64 | Better L2 reuse |
| K2 | Warp reduction | Use warp shuffles | Reduce shared mem |
| K3 | Online softmax | Single-pass algorithm | Fused with attention |

Profile with: `ncu --set full` to capture memory metrics.

Didn't find tool you were looking for?

Be as detailed as possible for better results