simulation-validator

Validate simulations across three stages — run pre-flight checks on configuration files (parameter ranges, required fields, disk space), monitor runtime logs for residual growth, NaN/Inf, and adaptive dt collapse, and perform post-flight validation of results (physical bounds, mass/energy conservation, convergence). Diagnose failed simulations with probable-cause analysis and recommended fixes. Use when preparing to launch a simulation, checking whether a running job is healthy, verifying that finished results are trustworthy, or debugging a crash or blow-up, even if the user only says "my simulation crashed" or "can I trust these results."


Install this agent skill to your Project

npx add-skill https://github.com/HeshamFS/materials-simulation-skills/tree/main/skills/simulation-workflow/simulation-validator

Metadata

Additional technical details for this skill

author: HeshamFS
version: 1.1.0
eval cases: 2
tested with: [ "claude-code", "gemini-cli", "vs-code-copilot" ]
last reviewed: 2026-03-26
security tier: high
security reviewed: YES

SKILL.md

Simulation Validator

Goal

Provide a three-stage validation protocol: pre-flight checks, runtime monitoring, and post-flight validation for materials simulations.

Requirements

Python 3.8+
No external dependencies (uses Python standard library only)
Works on Linux, macOS, and Windows

Inputs to Gather

Before running validation scripts, collect from the user:

Input	Description	Example
Config file	Simulation configuration (JSON/YAML)	`simulation.json`
Log file	Runtime output log	`simulation.log`
Metrics file	Post-run metrics (JSON)	`results.json`
Required params	Parameters that must exist	`dt,dx,kappa`
Valid ranges	Parameter bounds	`dt:1e-6:1e-2`

Decision Guidance

When to Run Each Stage

Is simulation about to start?
├── YES → Run Stage 1: preflight_checker.py
│         └── BLOCK status? → Fix issues, do NOT run simulation
│         └── WARN status? → Review warnings, document if accepted
│         └── PASS status? → Proceed to run simulation
│
Is simulation running?
├── YES → Run Stage 2: runtime_monitor.py (periodically)
│         └── Alerts? → Consider stopping, check parameters
│
Has simulation finished?
├── YES → Run Stage 3: result_validator.py
│         └── Failed checks? → Do NOT use results
│                            → Run failure_diagnoser.py
│         └── All passed? → Results are valid

Choosing Validation Thresholds

Metric	Conservative	Standard	Relaxed
Mass tolerance (`--mass-tol`)	1e-6	1e-3	1e-2
Residual growth factor (`--residual-growth`)	2x	10x	100x
dt reduction factor (`--dt-drop`)	10x	100x	1000x

Script Outputs (JSON Fields)

Script	Output Fields
`scripts/preflight_checker.py`	`report.status`, `report.blockers`, `report.warnings`
`scripts/runtime_monitor.py`	`alerts`, `residual_stats`, `dt_stats`
`scripts/result_validator.py`	`checks`, `confidence_score`, `failed_checks`
`scripts/failure_diagnoser.py`	`probable_causes`, `recommended_fixes`
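
As a rough illustration of how a caller might consume these fields, the sketch below gates a launch on the pre-flight report. It assumes the JSON nests `status`, `blockers`, and `warnings` under a top-level `report` key, as the dotted field names suggest. The helper name `should_launch`, the `accept_warnings` switch, and the sample payload are hypothetical and not part of the skill.

```python
import json

def should_launch(preflight_json: str, accept_warnings: bool = False) -> bool:
    """Decide whether to start the run from a pre-flight JSON report.

    Assumed layout: {"report": {"status": ..., "blockers": [...], "warnings": [...]}}.
    """
    report = json.loads(preflight_json).get("report", {})
    status = report.get("status", "BLOCK")  # a missing status is treated as blocking
    for item in report.get("blockers", []):
        print(f"BLOCKER: {item}")
    for item in report.get("warnings", []):
        print(f"WARNING: {item}")
    if status == "PASS":
        return True
    # WARN only proceeds when the user has reviewed and accepted the warnings.
    return status == "WARN" and accept_warnings

# Hypothetical payload for illustration only.
sample = '{"report": {"status": "WARN", "blockers": [], "warnings": ["dt near upper bound"]}}'
print(should_launch(sample))
print(should_launch(sample, accept_warnings=True))
```

Defaulting to `BLOCK` when the status is absent keeps a malformed report from silently allowing a launch.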

Three-Stage Validation Protocol

Stage 1: Pre-flight (Before Simulation)

Run scripts/preflight_checker.py --config simulation.json
BLOCK status: Stop immediately, fix all blocker issues
WARN status: Review warnings, document accepted risks
PASS status: Proceed to simulation

bash

python3 scripts/preflight_checker.py \
    --config simulation.json \
    --required dt,dx,kappa \
    --ranges "dt:1e-6:1e-2,dx:1e-4:1e-1" \
    --min-free-gb 1.0 \
    --json
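
To make the `--required` and `--ranges` arguments concrete, here is a minimal sketch of the kind of check they describe. It is not the code in `preflight_checker.py`: the function names are hypothetical, and treating every missing, non-numeric, or out-of-range parameter as a blocker is an assumption made for illustration.

```python
import math

def parse_ranges(spec: str) -> dict:
    """Parse 'name:min:max,name:min:max' into {name: (min, max)}."""
    ranges = {}
    for entry in spec.split(","):
        name, lo, hi = entry.strip().split(":")
        lo, hi = float(lo), float(hi)
        if not (math.isfinite(lo) and math.isfinite(hi)) or lo > hi:
            raise ValueError(f"invalid range for {name!r}: {entry!r}")
        ranges[name] = (lo, hi)
    return ranges

def check_config(config: dict, required: list, ranges: dict) -> dict:
    """Return a status and blocker list (assumption: every problem blocks)."""
    blockers = [f"missing required parameter: {n}" for n in required if n not in config]
    for name, (lo, hi) in ranges.items():
        if name not in config:
            continue
        value = config[name]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            blockers.append(f"Non-numeric value for {name}")
        elif not lo <= value <= hi:
            blockers.append(f"{name}={value} out of range [{lo}, {hi}]")
    return {"status": "BLOCK" if blockers else "PASS", "blockers": blockers}

ranges = parse_ranges("dt:1e-6:1e-2,dx:1e-4:1e-1")
# dt is too large and kappa is missing, so this config is blocked.
print(check_config({"dt": 0.5, "dx": 0.01}, ["dt", "dx", "kappa"], ranges))
```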

Stage 2: Runtime (During Simulation)

Run scripts/runtime_monitor.py --log simulation.log periodically
Configure alert thresholds based on problem type
Stop simulation if critical alerts appear

bash

python3 scripts/runtime_monitor.py \
    --log simulation.log \
    --residual-growth 10.0 \
    --dt-drop 100.0 \
    --json
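
The thresholds can be read as ratios. The sketch below shows one plausible interpretation, assuming residuals and time steps have already been extracted from the log as lists of floats. Here "residual growth" is taken to be a residual divided by the smallest residual seen before it, and "dt drop" is the largest dt divided by the most recent dt. Both definitions and the function name `monitor` are assumptions; the shipped script may compute these differently.

```python
import math

def monitor(residuals, dts, residual_growth=10.0, dt_drop=100.0):
    """Return a list of alert strings for the given residual and dt histories."""
    alerts = []
    if any(not math.isfinite(r) for r in residuals):
        alerts.append("non-finite residual (NaN/Inf)")

    # Worst growth of a residual relative to the smallest earlier residual.
    finite = [r for r in residuals if math.isfinite(r) and r > 0]
    if finite:
        running_min, worst = finite[0], 1.0
        for r in finite[1:]:
            worst = max(worst, r / running_min)
            running_min = min(running_min, r)
        if worst >= residual_growth:
            alerts.append(f"residual grew {worst:.3g}x (threshold {residual_growth}x)")

    # Collapse of the adaptive time step relative to its largest value.
    positive_dts = [d for d in dts if math.isfinite(d) and d > 0]
    if positive_dts:
        drop = max(positive_dts) / positive_dts[-1]
        if drop >= dt_drop:
            alerts.append(f"dt dropped {drop:.3g}x (threshold {dt_drop}x)")
    return alerts

print(monitor([1e-3, 5e-4, 2e-2], [1e-3, 1e-3, 1e-6]))
```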

Stage 3: Post-flight (After Simulation)

Run scripts/result_validator.py --metrics results.json
All checks PASS: Results are valid for analysis
Any check FAIL: Do NOT use results, diagnose failure

bash

python3 scripts/result_validator.py \
    --metrics results.json \
    --bound-min 0.0 \
    --bound-max 1.0 \
    --mass-tol 1e-3 \
    --json
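
For intuition about what the bound and mass-tolerance flags test, the following sketch runs two checks on a metrics dictionary. The metric key names (`field_min`, `field_max`, `mass_initial`, `mass_final`), the use of relative mass drift, and the confidence score as the fraction of checks passed are all assumptions for illustration; they are not taken from `result_validator.py`.

```python
def validate(metrics, bound_min=0.0, bound_max=1.0, mass_tol=1e-3):
    """Run a physical-bounds check and a mass-conservation check."""
    checks = {}
    # Physical bounds: the field must stay inside [bound_min, bound_max].
    checks["bounds"] = (
        bound_min <= metrics["field_min"] and metrics["field_max"] <= bound_max
    )
    # Mass conservation: relative drift between initial and final mass.
    m0, m1 = metrics["mass_initial"], metrics["mass_final"]
    drift = abs(m1 - m0) / abs(m0) if m0 != 0 else abs(m1)
    checks["mass_conservation"] = drift <= mass_tol

    failed = [name for name, ok in checks.items() if not ok]
    score = (len(checks) - len(failed)) / len(checks)  # assumed scoring rule
    return {"checks": checks, "confidence_score": score, "failed_checks": failed}

# Hypothetical metrics: the field overshoots the upper bound slightly.
result = validate(
    {"field_min": 0.0, "field_max": 1.02, "mass_initial": 100.0, "mass_final": 100.05}
)
print(result)
```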

Failure Diagnosis

When validation fails:

bash

python3 scripts/failure_diagnoser.py --log simulation.log --json

Conversational Workflow Example

User: My phase field simulation crashed after 1000 steps. Can you help me figure out why?

Agent workflow:

First, check the log for obvious errors:

bash

python3 scripts/failure_diagnoser.py --log simulation.log --json

If diagnosis suggests numerical blow-up, check runtime stats:

bash

python3 scripts/runtime_monitor.py --log simulation.log --json

Recommend fixes based on findings:

If residual grew rapidly → reduce time step
If dt collapsed → check stability conditions
If NaN detected → check initial conditions
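
An agent or wrapper that chains these two commands could invoke them with explicit argument lists, in line with the skill's stated safety measures. The sketch below only assembles and prints the command; `build_command` and `run_json` are hypothetical helpers, and `run_json` assumes the script prints a single JSON document to stdout.

```python
import json
import subprocess
import sys

def build_command(script, **flags):
    """Build an argument list such as [python, scripts/x.py, --log, file, --json]."""
    cmd = [sys.executable, f"scripts/{script}"]
    for name, value in flags.items():
        cmd += [f"--{name.replace('_', '-')}", str(value)]
    return cmd + ["--json"]

def run_json(cmd):
    """Run a command without a shell and parse its stdout as JSON."""
    completed = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return json.loads(completed.stdout)

# Step 1 of the workflow: diagnose the log (the command is printed, not executed).
print(build_command("failure_diagnoser.py", log="simulation.log"))
# Step 2: runtime stats with an explicit residual-growth threshold.
print(build_command("runtime_monitor.py", log="simulation.log", residual_growth=10.0))
```

Passing a list rather than a single string means no shell is involved, so a log path containing spaces or metacharacters is handed to the script as one literal argument.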

Error Handling

Error	Cause	Resolution
`Config not found`	File path invalid	Verify config path exists
`Non-numeric value`	Parameter is not a number	Fix config file format
`out of range`	Parameter outside bounds	Adjust parameter or bounds
`Output directory not writable`	Permission issue	Check directory permissions
`Insufficient disk space`	Disk nearly full	Free up space or reduce output

Interpretation Guidance

Status Meanings

Status	Meaning	Action
PASS	All checks passed	Proceed with confidence
WARN	Non-critical issues found	Review and document
BLOCK	Critical issues found	Must fix before proceeding

Confidence Score Interpretation

Score	Meaning
1.0	All validation checks passed
0.75 to < 1.0	Most checks passed, minor issues
0.5 to < 0.75	Significant issues, review carefully
< 0.5	Major problems, do not trust results
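
The bands above can be applied mechanically. This small sketch maps a score to the table's wording, assuming scores lie in [0, 1] and that each boundary value belongs to the higher band; the function name is hypothetical.

```python
def interpret_confidence(score: float) -> str:
    """Map a confidence score in [0, 1] to the meaning given in the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence score must be in [0, 1], got {score}")
    if score == 1.0:
        return "All validation checks passed"
    if score >= 0.75:
        return "Most checks passed, minor issues"
    if score >= 0.5:
        return "Significant issues, review carefully"
    return "Major problems, do not trust results"

for s in (1.0, 0.8, 0.5, 0.2):
    print(s, "->", interpret_confidence(s))
```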

Common Failure Patterns

Pattern in Log	Likely Cause	Recommended Fix
NaN, Inf, overflow	Numerical instability	Reduce dt, increase damping
max iterations, did not converge	Solver failure	Tune preconditioner, tolerances
out of memory	Memory exhaustion	Reduce mesh, enable out-of-core
dt reduced	Adaptive stepping triggered	May be okay if controlled
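
The table can be expressed as a small set of hardcoded, pre-compiled regular expressions. The sketch below is illustrative only: the exact patterns are assumptions and are likely simpler than those documented in references/log_patterns.md, and `diagnose` is a hypothetical name.

```python
import re

# (pattern, likely cause, recommended fix), taken from the table above.
PATTERNS = [
    (re.compile(r"\b(nan|inf|overflow)\b", re.IGNORECASE),
     "Numerical instability", "Reduce dt, increase damping"),
    (re.compile(r"max iterations|did not converge", re.IGNORECASE),
     "Solver failure", "Tune preconditioner, tolerances"),
    (re.compile(r"out of memory", re.IGNORECASE),
     "Memory exhaustion", "Reduce mesh, enable out-of-core"),
    (re.compile(r"dt reduced", re.IGNORECASE),
     "Adaptive stepping triggered", "May be okay if controlled"),
]

def diagnose(log_text: str) -> dict:
    """Collect each matching cause once, in order of first appearance."""
    causes, fixes = [], []
    for line in log_text.splitlines():
        for pattern, cause, fix in PATTERNS:
            if pattern.search(line) and cause not in causes:
                causes.append(cause)
                fixes.append(fix)
    return {"probable_causes": causes, "recommended_fixes": fixes}

# Hypothetical log excerpt.
log = "step 998 residual=3.2e+05\nstep 999 residual=nan\nsolver did not converge\n"
print(diagnose(log))
```

The word boundaries around `nan` and `inf` keep ordinary words such as "INFO" from being flagged as instability.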

Security

Input Validation

Config file paths are validated for existence before parsing; non-existent paths produce clear errors
--required parameter names are validated against a safe-character allowlist
--ranges entries are parsed as name:min:max with finite numeric bounds enforced
--min-free-gb is validated as a finite positive number
--residual-growth and --dt-drop thresholds are validated as finite positive numbers
--bound-min, --bound-max, and --mass-tol are validated as finite numbers with bound-max > bound-min
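
A minimal sketch of two of these validations follows. The allowlist regex is an assumption (the skill does not state which characters are considered safe), and both helper names are hypothetical.

```python
import math
import re

# Assumed allowlist: a letter or underscore, then letters, digits, '_', '.', or '-'.
SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

def parse_required(spec: str) -> list:
    """Split a comma-separated --required value and reject unsafe names."""
    names = [n.strip() for n in spec.split(",") if n.strip()]
    for name in names:
        if not SAFE_NAME.fullmatch(name):
            raise ValueError(f"unsafe parameter name: {name!r}")
    return names

def positive_finite(text: str, flag: str) -> float:
    """Parse a threshold such as --min-free-gb and require a finite value > 0."""
    value = float(text)
    if not math.isfinite(value) or value <= 0:
        raise ValueError(f"{flag} must be a finite positive number, got {text!r}")
    return value

print(parse_required("dt,dx,kappa"))
print(positive_finite("1.0", "--min-free-gb"))
```

Note that `float("inf")` and `float("nan")` parse without error in Python, which is why the explicit `math.isfinite` check is needed.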

File Access

preflight_checker.py reads a single user-specified config file (JSON/YAML) and checks disk space on the output directory
runtime_monitor.py reads a single log file specified by --log; log files are size-limited (500 MB max) before parsing
result_validator.py reads a single metrics file (JSON) specified by --metrics
failure_diagnoser.py reads a single log file specified by --log
No scripts write to the filesystem; all output goes to stdout

Tool Restrictions

Read: Used to inspect script source, references, config files, and simulation logs
Bash: Used to execute the four Python validation scripts (preflight_checker.py, runtime_monitor.py, result_validator.py, failure_diagnoser.py) with explicit argument lists
Write: Used by the agent to save validation reports (the scripts themselves only print to stdout); writes are scoped to the user's working directory
Grep/Glob: Used to locate log files, config files, and search references

Safety Measures

No eval(), exec(), or dynamic code generation
All subprocess calls use explicit argument lists (no shell=True)
Log parsing uses pre-compiled regex patterns; user-supplied patterns are not accepted (patterns are hardcoded)
Phase names and diagnostic strings extracted from logs are sanitized (truncated, control characters stripped) before inclusion in output
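
The sanitization step in the last item might look like the sketch below. The 200-character limit and the choice to drop every Unicode "Other" category character (control, format, unassigned, and similar) are assumptions, and `sanitize` is a hypothetical name.

```python
import unicodedata

def sanitize(text: str, max_len: int = 200) -> str:
    """Strip control characters from a log-derived string, then truncate it."""
    # Unicode general categories starting with "C" cover control and format characters.
    cleaned = "".join(ch for ch in text if unicodedata.category(ch)[0] != "C")
    return cleaned[:max_len]

# An ANSI escape byte and a trailing newline are removed; printable text is kept.
print(sanitize("solver\x1b[31m failed\n"))
```

Stripping before truncating ensures the length limit applies to the characters that actually reach the output.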

Limitations

Not a real-time monitor: Scripts analyze the log as it exists at each invocation, so runtime monitoring means re-running runtime_monitor.py periodically
Regex-based: Log parsing depends on pattern matching; may miss unusual formats
No automatic fixes: Scripts diagnose but don't modify simulations

References

references/validation_protocol.md - Detailed checklist and criteria
references/log_patterns.md - Common failure signatures and regex patterns

Version History

v1.1.0 (2024-12-24): Enhanced documentation, decision guidance, Windows compatibility
v1.0.0: Initial release with 4 validation scripts

Maintainer

HeshamFS Core maintainer

Source details

Full Name: HeshamFS/materials-simulation-skills
Branch: main
Path in repo: skills/simulation-workflow/simulation-validator
License: Apache License 2.0
Topics: agent-skills cli-tools skills agents llm materials-science computational-science numerical-methods simulation
