Agent skill
stan-fundamentals
Foundational knowledge for writing Stan 2.37 models including program structure, type system, distributions, and best practices. Use when creating or reviewing Stan models.
Stars
163
Forks
31
Install this agent skill to your Project
npx add-skill https://github.com/majiayu000/claude-skill-registry/tree/main/skills/development/stan-fundamentals-choxos-bayesianagent
SKILL.md
Stan Fundamentals
When to Use This Skill
- Writing new Stan models from scratch
- Understanding Stan program structure
- Learning Stan syntax and conventions
- Translating models from other languages to Stan
- Optimizing existing Stan code
Program Structure
Stan models have up to 7 blocks in this exact order:
stan
functions { } // User-defined functions
data { } // Input data declarations
transformed data { } // Data preprocessing
parameters { } // Model parameters
transformed parameters { } // Derived parameters
model { } // Log probability
generated quantities { } // Posterior predictions
All blocks are optional. Empty string is valid (but useless) Stan program.
Type System Quick Reference
Scalars
stan
int n; // Integer
real x; // Real number
complex z; // Complex number
Vectors and Matrices
stan
vector[N] v; // Column vector
row_vector[N] r; // Row vector
matrix[M, N] A; // Matrix
Arrays (Modern Syntax)
stan
array[N] real x; // 1D array of reals
array[M, N] int y; // 2D array of integers
array[J] vector[K] theta; // Array of vectors
Constrained Types
stan
real<lower=0> sigma; // Non-negative
real<lower=0, upper=1> p; // Probability
simplex[K] theta; // Sums to 1
ordered[K] c; // Ascending
corr_matrix[K] Omega; // Correlation
cov_matrix[K] Sigma; // Covariance
cholesky_factor_corr[K] L_Omega; // Cholesky correlation
Key Distributions
Continuous (SD parameterization!)
stan
y ~ normal(mu, sigma); // sigma is SD
y ~ student_t(nu, mu, sigma);
y ~ cauchy(mu, sigma);
y ~ exponential(lambda);
y ~ gamma(alpha, beta);
y ~ beta(a, b);
y ~ lognormal(mu, sigma);
Discrete
stan
y ~ bernoulli(theta);
y ~ binomial(n, theta);
y ~ poisson(lambda);
y ~ neg_binomial_2(mu, phi);
y ~ categorical(theta);
Multivariate
stan
y ~ multi_normal(mu, Sigma); // Sigma is COVARIANCE
y ~ multi_normal_cholesky(mu, L);
y ~ lkj_corr(eta);
Essential Patterns
Vectorization
stan
// GOOD - Efficient
y ~ normal(mu, sigma);
// BAD - Slow
for (n in 1:N) y[n] ~ normal(mu[n], sigma);
Non-Centered Parameterization
stan
parameters {
vector[J] theta_raw;
}
transformed parameters {
vector[J] theta = mu + tau * theta_raw;
}
model {
theta_raw ~ std_normal();
}
Target Syntax
stan
// These are equivalent:
y ~ normal(mu, sigma);
target += normal_lpdf(y | mu, sigma);
Common Priors
stan
// Location parameters
mu ~ normal(0, 10);
// Scale parameters
sigma ~ exponential(1);
sigma ~ cauchy(0, 2.5); // half-Cauchy when sigma > 0
// Probabilities
theta ~ beta(1, 1); // Uniform on (0,1)
// Regression coefficients
beta ~ normal(0, 2.5);
// Correlation matrices
Omega ~ lkj_corr(2); // eta=2 favors identity
R Integration (cmdstanr)
r
library(cmdstanr)
mod <- cmdstan_model("model.stan")
fit <- mod$sample(data = stan_data, chains = 4)
fit$summary()
fit$cmdstan_diagnose()
Bayesian Workflow (Statistical Rethinking)
1. Prior Predictive Check
r
# Simulate from priors before fitting
n_sim <- 1000
prior_alpha <- rnorm(n_sim, 0, 10)
prior_sigma <- rexp(n_sim, 1)
# Plot: do these produce sensible y values?
2. Fit Model
r
fit <- mod$sample(data = stan_data, chains = 4, adapt_delta = 0.95)
3. Diagnostics
r
fit$summary() # Rhat, ESS
fit$cmdstan_diagnose() # Divergences, treedepth
library(bayesplot)
mcmc_rank_hist(fit$draws()) # Ranked traceplots (preferred)
4. Posterior Predictive Check
r
y_rep <- fit$draws("y_rep", format = "matrix")
library(bayesplot)
ppc_dens_overlay(y, y_rep[1:100, ])
5. Model Comparison
r
library(loo)
loo1 <- loo(fit1$draws("log_lik"))
loo2 <- loo(fit2$draws("log_lik"))
loo_compare(loo1, loo2)
link vs sim Pattern
link(): Uncertainty in mu (epistemic)
r
# Posterior of expected value
post <- fit$draws(format = "df")
mu <- post$alpha + post$beta * x_new # Matrix of mu samples
mu_PI <- apply(mu, 2, quantile, c(0.055, 0.945))
sim(): Prediction interval (epistemic + aleatoric)
r
# Includes observation noise
y_sim <- rnorm(n_samples, mu, post$sigma)
y_PI <- apply(y_sim, 2, quantile, c(0.055, 0.945))
Generated Quantities Template
Always include for diagnostics and model comparison:
stan
generated quantities {
vector[N] log_lik; // For LOO/WAIC
array[N] real y_rep; // For posterior predictive checks
for (n in 1:N) {
log_lik[n] = normal_lpdf(y[n] | mu[n], sigma);
y_rep[n] = normal_rng(mu[n], sigma);
}
}
Diagnostic Checklist
- Rhat < 1.01 for all parameters
- ESS_bulk > 400
- ESS_tail > 400
- Zero divergences
- Not hitting max_treedepth
- Prior predictive produces sensible values
- Posterior predictive matches data pattern
Key Differences from BUGS
| Feature | Stan | BUGS/JAGS |
|---|---|---|
| Normal | normal(mu, sigma) SD |
dnorm(mu, tau) precision |
| MVN | multi_normal(mu, Sigma) cov |
dmnorm(mu, Omega) precision |
| Execution | Sequential (order matters) | Declarative (order doesn't matter) |
| Sampling | HMC/NUTS | Gibbs/Metropolis |
Didn't find tool you were looking for?