microdf
Weighted pandas DataFrames for survey microdata analysis - inequality, poverty, and distributional calculations. Triggers: "weighted mean", "Gini", "poverty rate", "inequality", "MicroDataFrame", "MicroSeries", "weighted statistics", "decile", "quintile", "income distribution", "microdf"
MicroDF
MicroDF provides weighted pandas DataFrames and Series for analyzing survey microdata, with built-in support for inequality and poverty calculations.
For Users
What is MicroDF?
When you see poverty rates, Gini coefficients, or distributional charts in PolicyEngine, those are calculated using MicroDF.
MicroDF powers:
- Poverty rate calculations (SPM)
- Inequality metrics (Gini coefficient)
- Income distribution analysis
- Weighted statistics from survey data
Understanding the Metrics
Gini coefficient:
- Calculated using MicroDF from weighted income data
- Ranges from 0 (perfect equality) to 1 (perfect inequality)
- For the US, roughly 0.48 on Census Bureau pre-tax money income; the value varies with the income measure used
Poverty rates:
- Calculated using MicroDF with weighted household data
- Compares income to poverty thresholds
- Accounts for household composition
Percentiles:
- MicroDF calculates weighted percentiles
- Shows income distribution (10th, 50th, 90th percentile)
For Analysts
Installation
uv pip install microdf-python
Quick Start
import microdf as mdf
import pandas as pd
# Create sample data
df = pd.DataFrame({
    'income': [10000, 20000, 30000, 40000, 50000],
    'weights': [1, 2, 3, 2, 1]
})
# Create MicroDataFrame
mdf_df = mdf.MicroDataFrame(df, weights='weights')
# All operations are weight-aware
print(f"Weighted mean: ${mdf_df.income.mean():,.0f}")
print(f"Gini coefficient: {mdf_df.income.gini():.3f}")
Common Operations
Weighted statistics:
mdf_df.income.mean() # Weighted mean
mdf_df.income.median() # Weighted median
mdf_df.income.sum() # Weighted sum
mdf_df.income.std() # Weighted standard deviation
Inequality metrics:
mdf_df.income.gini() # Gini coefficient
mdf_df.income.top_x_pct_share(0.1) # Top 10% share (argument is a fraction between 0 and 1)
mdf_df.income.top_x_pct_share(0.01) # Top 1% share
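To make the weighted Gini concrete, here is a minimal from-scratch sketch in plain Python. It uses one common definition (half the weighted mean absolute difference divided by the weighted mean) and is not microdf's implementation, which may differ in small-sample details. The function name `weighted_gini` is hypothetical.

```python
# Illustrative only: a from-scratch weighted Gini, not microdf's implementation.
def weighted_gini(values, weights):
    """Half the weighted mean absolute difference, divided by the weighted mean."""
    total_weight = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total_weight
    # Sum |x_i - x_j| over all ordered pairs, each pair weighted by w_i * w_j.
    abs_diff = sum(
        wi * wj * abs(vi - vj)
        for vi, wi in zip(values, weights)
        for vj, wj in zip(values, weights)
    )
    return abs_diff / (2 * total_weight**2 * mean)


incomes = [10000, 20000, 30000, 40000, 50000]
weights = [1, 2, 3, 2, 1]
gini = weighted_gini(incomes, weights)
print(f"Gini (sketch): {gini:.3f}")
```

Under this definition, an integer weight of 2 gives the same result as listing the row twice, which is the behavior one usually wants from survey weights.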
Poverty analysis:
# Illustrative threshold; substitute your own poverty line
poverty_line = 15000
# Poverty rate (income < threshold)
poverty_rate = mdf_df.poverty_rate(
    income_measure='income',
    threshold=poverty_line
)
# Poverty gap (how far below threshold)
poverty_gap = mdf_df.poverty_gap(
    income_measure='income',
    threshold=poverty_line
)
# Deep poverty (income < 50% of threshold)
deep_poverty_rate = mdf_df.deep_poverty_rate(
    income_measure='income',
    threshold=poverty_line,
    deep_poverty_line=0.5
)
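The three poverty measures above can be sketched in plain Python to show what each one counts. This is an illustration under stated assumptions, not microdf's code: the helper `poverty_metrics` is hypothetical, the gap is defined here as the total weighted shortfall below the threshold, and the threshold of 25000 is an arbitrary example value.

```python
# Illustrative only: plain-Python definitions of the three measures.
def poverty_metrics(incomes, weights, threshold, deep_fraction=0.5):
    total_weight = sum(weights)
    # Weight of units with income strictly below the threshold.
    poor_weight = sum(w for x, w in zip(incomes, weights) if x < threshold)
    # Weight of units below a fraction (default 50%) of the threshold.
    deep_weight = sum(
        w for x, w in zip(incomes, weights) if x < deep_fraction * threshold
    )
    # Total weighted shortfall; units at or above the threshold contribute 0.
    gap = sum(w * max(threshold - x, 0) for x, w in zip(incomes, weights))
    return {
        "poverty_rate": poor_weight / total_weight,
        "deep_poverty_rate": deep_weight / total_weight,
        "poverty_gap": gap,
    }


metrics = poverty_metrics(
    incomes=[10000, 20000, 30000, 40000, 50000],
    weights=[1, 2, 3, 2, 1],
    threshold=25000,
)
print(metrics)
```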
Quantiles:
# Deciles
mdf_df.income.decile_values()
# Quintiles
mdf_df.income.quintile_values()
# Custom quantiles
mdf_df.income.quantile(0.25) # 25th percentile
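A weighted quantile differs from an ordinary one because each row counts in proportion to its weight. The sketch below uses the simplest convention (the smallest value whose cumulative weight share reaches `q`, with no interpolation); microdf may use a different interpolation rule, so treat the function `weighted_quantile` as a hypothetical illustration.

```python
# Illustrative only: a non-interpolating weighted quantile.
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight reaches q * total weight."""
    pairs = sorted(zip(values, weights))
    target = q * sum(weights)
    cumulative = 0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= target:
            return value
    return pairs[-1][0]  # Guard against floating-point shortfall at q = 1


incomes = [10000, 20000, 30000, 40000, 50000]
weights = [1, 2, 3, 2, 1]
p25 = weighted_quantile(incomes, weights, 0.25)
p50 = weighted_quantile(incomes, weights, 0.50)
p90 = weighted_quantile(incomes, weights, 0.90)
print(p25, p50, p90)
```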
MicroSeries
# Extract a Series with weights
income_series = mdf_df.income # This is a MicroSeries
# MicroSeries operations
income_series.mean()
income_series.gini()
income_series.percentile(50)
WARNING: .values and .to_numpy() strip weights. These methods now emit a UserWarning because they return plain numpy arrays where operations like .mean() are unweighted. Always use MicroSeries methods directly for weighted calculations:
# ms is any MicroSeries, e.g. ms = mdf_df.income
# ❌ WRONG - strips weights, .mean() is unweighted
ms.values.mean()
ms.to_numpy().mean()
# ✅ CORRECT - weighted automatically
ms.mean()
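To see why stripping weights matters, the following plain-Python sketch compares the two means on a deliberately lopsided sample. The numbers are made up for illustration and no microdf calls are involved.

```python
# Illustrative only: one low-income row stands for 9 households,
# one high-income row stands for 1 household.
incomes = [10000, 100000]
weights = [9, 1]

# Unweighted mean treats both rows as equally common.
unweighted_mean = sum(incomes) / len(incomes)

# Weighted mean counts each row in proportion to its weight.
weighted_mean = sum(x * w for x, w in zip(incomes, weights)) / sum(weights)

print(f"Unweighted: {unweighted_mean:,.0f}")
print(f"Weighted:   {weighted_mean:,.0f}")
```

The unweighted figure describes the sample rows, while the weighted figure describes the population they represent.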
Working with PolicyEngine Results
import microdf as mdf
import pandas as pd
from policyengine_us import Simulation
# Run simulation with axes (multiple households)
situation_with_axes = {...} # See policyengine-us-skill
sim = Simulation(situation=situation_with_axes)
# Get results as arrays
incomes = sim.calculate("household_net_income", 2026)
weights = sim.calculate("household_weight", 2026)
# Create MicroDataFrame
df = pd.DataFrame({'income': incomes, 'weight': weights})
mdf_df = mdf.MicroDataFrame(df, weights='weight')
# Calculate metrics
gini = mdf_df.income.gini()
poverty_rate = mdf_df.poverty_rate('income', threshold=15000)
print(f"Gini: {gini:.3f}")
print(f"Poverty rate: {poverty_rate:.1%}")
For Contributors
Repository
Location: PolicyEngine/microdf
Clone:
git clone https://github.com/PolicyEngine/microdf
cd microdf
Current Implementation
To see the current API:
# Main classes
cat microdf/microframe.py # MicroDataFrame
cat microdf/microseries.py # MicroSeries
# Key modules
cat microdf/generic.py # Generic weighted operations
cat microdf/inequality.py # Gini, top shares
cat microdf/poverty.py # Poverty metrics
To see all methods:
# MicroDataFrame methods
grep "def " microdf/microframe.py
# MicroSeries methods
grep "def " microdf/microseries.py
Testing
To see test patterns:
ls tests/
cat tests/test_microframe.py
Run tests:
make test
# Or
pytest tests/ -v
Contributing
Before contributing:
- Check if method already exists
- Ensure it's weighted correctly
- Add tests
- Follow policyengine-standards-skill
Common contributions:
- New inequality metrics
- New poverty measures
- Performance optimizations
- Bug fixes
Advanced Patterns
Custom Aggregations
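# Define a custom weighted aggregation on plain (unweighted) inputs
def weighted_operation(series, weights):
    return (series * weights).sum() / weights.sum()
# Apply to the plain pandas columns (df is the DataFrame from the Quick Start).
# MicroSeries.sum() is already weighted, so passing a MicroSeries here
# would apply the weights twice.
result = weighted_operation(df['income'], df['weights'])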
Groupby Operations
# Group by with weights
grouped = mdf_df.groupby('state')
state_means = grouped.income.mean() # Weighted means by state
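As a reference for what a weighted group mean computes, here is a stdlib-only sketch. The helper `weighted_mean_by_group` and the sample rows are hypothetical; the point is only that weights are summed within each group rather than across the whole sample.

```python
from collections import defaultdict


# Illustrative only: weighted mean per group without pandas or microdf.
def weighted_mean_by_group(groups, values, weights):
    weighted_totals = defaultdict(float)
    weight_totals = defaultdict(float)
    for group, value, weight in zip(groups, values, weights):
        weighted_totals[group] += value * weight
        weight_totals[group] += weight
    return {g: weighted_totals[g] / weight_totals[g] for g in weighted_totals}


state_means = weighted_mean_by_group(
    groups=["CA", "CA", "NY", "NY"],
    values=[10000, 30000, 20000, 40000],
    weights=[1, 3, 2, 2],
)
print(state_means)
```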
Inequality Decomposition
To see decomposition methods:
grep -r -A 20 "def.*decomp" microdf/
Integration Examples
Example 1: PolicyEngine Blog Post Analysis
# Pattern from PolicyEngine blog posts
import microdf as mdf
import pandas as pd
# Get simulation results (baseline_sim and reform_sim are existing Simulation objects)
baseline_income = baseline_sim.calculate("household_net_income", 2026)
reform_income = reform_sim.calculate("household_net_income", 2026)
weights = baseline_sim.calculate("household_weight", 2026)
# Create MicroDataFrame
df = pd.DataFrame({
'baseline_income': baseline_income,
'reform_income': reform_income,
'weight': weights
})
mdf_df = mdf.MicroDataFrame(df, weights='weight')
# Calculate impacts
baseline_gini = mdf_df.baseline_income.gini()
reform_gini = mdf_df.reform_income.gini()
print(f"Gini change: {reform_gini - baseline_gini:+.4f}")
Example 2: Poverty Analysis
# Calculate poverty under baseline and reform
import microdf as mdf
import pandas as pd
from policyengine_us import Simulation
# `situation` and `reform` are assumed to be defined earlier
baseline_sim = Simulation(situation=situation)
reform_sim = Simulation(situation=situation, reform=reform)
# Get incomes. Use household_weight (the only calibrated weight), mapped to spm_unit
baseline_income = baseline_sim.calculate("spm_unit_net_income", 2026)
reform_income = reform_sim.calculate("spm_unit_net_income", 2026)
spm_threshold = baseline_sim.calculate("spm_unit_poverty_threshold", 2026)
weights = baseline_sim.calculate("household_weight", 2026, map_to="spm_unit")
# Calculate poverty rates
df_baseline = mdf.MicroDataFrame(
    pd.DataFrame({'income': baseline_income, 'threshold': spm_threshold, 'weight': weights}),
    weights='weight'
)
poverty_baseline = (df_baseline.income < df_baseline.threshold).mean() # Weighted
# Same calculation for the reform
df_reform = mdf.MicroDataFrame(
    pd.DataFrame({'income': reform_income, 'threshold': spm_threshold, 'weight': weights}),
    weights='weight'
)
poverty_reform = (df_reform.income < df_reform.threshold).mean() # Weighted
print(f"Poverty reduction: {(poverty_baseline - poverty_reform):.1%}")
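The comparison above relies on the weighted mean of a below-threshold indicator being the poverty rate. A plain-Python sketch with per-unit thresholds makes that explicit. All numbers are invented for illustration, and `weighted_poverty_rate` is a hypothetical helper rather than a microdf or PolicyEngine function.

```python
# Illustrative only: poverty rate with a separate threshold per unit.
def weighted_poverty_rate(incomes, thresholds, weights):
    poor_weight = sum(
        w for x, t, w in zip(incomes, thresholds, weights) if x < t
    )
    return poor_weight / sum(weights)


thresholds = [15000, 25000, 20000]
weights = [2, 1, 1]
baseline_income = [12000, 30000, 18000]
reform_income = [16000, 30000, 19000]

poverty_baseline = weighted_poverty_rate(baseline_income, thresholds, weights)
poverty_reform = weighted_poverty_rate(reform_income, thresholds, weights)

# The difference of two rates is a change in percentage points.
change_pp = (poverty_baseline - poverty_reform) * 100
print(f"Poverty reduction: {change_pp:.1f} percentage points")
```

Reporting the difference in percentage points avoids confusing it with a relative (percent) change in the poverty rate.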
Package Status
Maturity: Stable, production-ready
API stability: Stable (rarely breaking changes)
Performance: Optimized for large datasets
To see version:
pip show microdf-python
To see changelog:
cat CHANGELOG.md # In microdf repo
Related Skills
- policyengine-us-skill - Generating data for microdf analysis
- policyengine-analysis-skill - Using microdf in policy analysis
- policyengine-us-data-skill - Data sources for microdf
Resources
Repository: https://github.com/PolicyEngine/microdf
PyPI: https://pypi.org/project/microdf-python/
Issues: https://github.com/PolicyEngine/microdf/issues