Agent skill

dqmc-run

Execute DQMC Monte Carlo sweeps on HDF5 simulation files. Use when running simulations, checkpointing long runs, using the queue system for multiple files, or checking simulation status.


Install this agent skill to your Project

npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/development/dqmc-run

SKILL.md

Run Simulations

Inputs

  • HDF5 simulation file(s) from the dqmc-generate and dqmc-parameter-scans skills
  • The dqmc binary at build/dqmc, plus the dqmc-util worker utility for queued runs

Outputs

  • Updated HDF5 files with measurements in /meas_eqlt/ (and /meas_uneqlt/ if enabled)

Decision Rules (single vs queue)

  • Exactly 1 file: run it directly with build/dqmc sim_0.h5.
  • 2+ files: use the queue/worker system (dqmc-util enqueue + dqmc-util worker) even on a local workstation.
    • This avoids ad-hoc shell loops, makes restarts/checkpointing consistent, and scales to parallel execution safely.
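
The rule above can be sketched as a small shell function that only prints the commands it would select, without running anything. The name plan_run is hypothetical and not part of dqmc-util; the queue directory name and the worker flags (-n 8 -s 300 -t 3600) are copied from the Procedure examples and should be adjusted to your machine. The sketch counts matches with shell globbing, which is assumed to agree with the Python expansion for simple patterns, and it assumes paths without whitespace.

```bash
# plan_run is a hypothetical helper, not part of dqmc-util.
# Usage: plan_run 'data/run/bin_*.h5'
# It prints the command(s) the decision rule selects; it does not run them.
plan_run() {
    pattern=${1:-}
    # Unquoted on purpose: let the shell expand the glob so matches can be counted.
    set -- $pattern
    if [ ! -e "${1:-}" ]; then
        echo "plan_run: no files match '$pattern'" >&2
        return 1
    fi
    if [ "$#" -eq 1 ]; then
        # Exactly one file: run the binary directly.
        printf 'build/dqmc %s\n' "$1"
    else
        # Two or more files: enqueue the quoted pattern, then start workers.
        printf "dqmc-util enqueue queue '%s'\n" "$pattern"
        printf 'dqmc-util worker queue ./build/dqmc -n 8 -s 300 -t 3600\n'
    fi
}
```

Printing instead of executing keeps the sketch safe to try: review the output first, then run the printed commands by hand.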

Procedure

Single file:

bash
build/dqmc sim_0.h5

With checkpointing (long runs):

bash
build/dqmc -s 300 -t 3600 sim_0.h5  # checkpoint every 300 seconds, max runtime 3600 seconds
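
When a batch scheduler enforces a wall-time limit, one way to pick the -t value is to subtract a safety margin from that limit so the run can stop on its own before the scheduler kills it. The sketch below only does that arithmetic. The helper name runtime_budget and the 300-second margin are assumptions for illustration, not part of dqmc.

```bash
# runtime_budget is a hypothetical helper, not part of dqmc.
# Usage: runtime_budget LIMIT_SECONDS MARGIN_SECONDS
# Prints LIMIT_SECONDS minus MARGIN_SECONDS, or fails if the margin is too large.
runtime_budget() {
    limit=$1
    margin=$2
    if [ "$margin" -ge "$limit" ]; then
        echo "runtime_budget: margin must be smaller than the limit" >&2
        return 1
    fi
    echo $((limit - margin))
}

# Example: a 1-hour scheduler limit with a 5-minute margin.
# Only prints the resulting command; it does not start a simulation.
echo "build/dqmc -s 300 -t $(runtime_budget 3600 300) sim_0.h5"
```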

Multiple files (queue system):

bash
# 1. Add files to queue/ (globs are expanded by Python, so quote them)
dqmc-util enqueue queue 'data/run/bin_*.h5'

# 2. Start workers
dqmc-util worker queue ./build/dqmc -n 8 -s 300 -t 3600  # 8 parallel workers

Multiple files (local temperature scan example):

bash
# enqueue all bins across temperatures
dqmc-util enqueue queue 'runs/pi_pi_U8_6x6/T*/bin_*.h5'

# run N workers in parallel (pick N for your machine)
dqmc-util worker queue ./build/dqmc -n 8 -s 300 -t 3600

Validation (single file)

bash
dqmc-util summary sim_0.h5

For multiple files:

bash
dqmc-util print-n data/run/   # directory paths should end with '/'

Failure Modes

| Symptom | Cause | Recovery |
|---|---|---|
| "opening... failed" | File doesn't exist | Check the path |
| NaN in output | Numerical instability | Reduce dt and/or n_matmul; check the other parameters |
| Killed by scheduler | Exceeded time limit | Set a maximum runtime with -t and enable checkpointing with -s |
| Large error bars | Sign problem | Add more sweeps and/or bins |
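
The first failure mode can be caught before launching anything. Below is a minimal sketch of a pre-flight check; check_paths is a hypothetical helper, not part of dqmc-util, and it only tests that each argument is a readable regular file. It does not verify that the file is a valid HDF5 simulation file.

```bash
# check_paths is a hypothetical helper, not part of dqmc-util.
# Usage: check_paths FILE...
# Reports every argument that is not a readable regular file and
# returns non-zero if at least one is missing.
check_paths() {
    missing=0
    for f in "$@"; do
        if [ -f "$f" ] && [ -r "$f" ]; then
            continue
        fi
        echo "missing or unreadable: $f" >&2
        missing=1
    done
    return "$missing"
}
```

A possible use is `check_paths sim_0.h5 && build/dqmc sim_0.h5`, so the binary only starts when the path resolves.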

Do Not

  • Kill running simulations that were started without checkpointing (the -s flag)
  • Run multiple workers on the same file
  • Use manual shell loops for 2+ files (prefer dqmc-util enqueue + dqmc-util worker)
