Agent skill

microdf

Weighted pandas DataFrames for survey microdata analysis - inequality, poverty, and distributional calculations. Triggers: "weighted mean", "Gini", "poverty rate", "inequality", "MicroDataFrame", "MicroSeries", "weighted statistics", "decile", "quintile", "income distribution", "microdf"

Install this agent skill to your Project

npx add-skill https://github.com/PolicyEngine/policyengine-claude/tree/main/skills/data-science/microdf-skill

SKILL.md

MicroDF

MicroDF provides weighted pandas DataFrames and Series for analyzing survey microdata, with built-in support for inequality and poverty calculations.

For Users

What is MicroDF?

When you see poverty rates, Gini coefficients, or distributional charts in PolicyEngine, those are calculated using MicroDF.

MicroDF powers:

Poverty rate calculations (Supplemental Poverty Measure, SPM)
Inequality metrics (Gini coefficient)
Income distribution analysis
Weighted statistics from survey data

Understanding the Metrics

Gini coefficient:

Calculated using MicroDF from weighted income data
Ranges from 0 (perfect equality) to 1 (perfect inequality)
Typically around 0.48 for US household income (the exact value depends on the income measure)
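
As a rough illustration of what a weighted Gini coefficient measures, the sketch below computes one from the area under the weighted Lorenz curve in plain Python. The function name `weighted_gini` is hypothetical and this is not MicroDF's implementation, which may use a different estimator and return slightly different values.

```python
def weighted_gini(values, weights):
    """Gini coefficient from the trapezoidal area under the weighted Lorenz curve."""
    pairs = sorted(zip(values, weights))  # order units from poorest to richest
    total_weight = sum(w for _, w in pairs)
    total_income = sum(v * w for v, w in pairs)
    cumulative_income = 0.0
    area_sum = 0.0
    for value, weight in pairs:
        previous = cumulative_income
        cumulative_income += value * weight
        # Each unit adds a trapezoid of width `weight` under the Lorenz curve
        area_sum += weight * (previous + cumulative_income)
    return 1.0 - area_sum / (total_weight * total_income)


# Same sample data as the Quick Start example
incomes = [10000, 20000, 30000, 40000, 50000]
weights = [1, 2, 3, 2, 1]
gini = weighted_gini(incomes, weights)
print(f"Gini (Lorenz-curve sketch): {gini:.3f}")
```

When every unit has the same income the function returns 0, and the value rises toward 1 as income concentrates in fewer weighted units.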

Poverty rates:

Calculated using MicroDF with weighted household data
Compares income to poverty thresholds
Accounts for household composition
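
The idea behind a weighted poverty rate can be sketched in a few lines of plain Python: it is the share of total survey weight held by units whose income falls below their own threshold. The function name and the household figures below are hypothetical, chosen only to show how thresholds that vary with household composition enter the calculation.

```python
def weighted_poverty_rate(incomes, thresholds, weights):
    """Share of total weight held by units with income below their threshold."""
    poor_weight = sum(
        w
        for income, threshold, w in zip(incomes, thresholds, weights)
        if income < threshold
    )
    return poor_weight / sum(weights)


# Hypothetical households; thresholds differ by household composition
incomes = [12000, 18000, 30000, 45000]
thresholds = [15000, 15000, 32000, 25000]
weights = [100, 200, 150, 50]
rate = weighted_poverty_rate(incomes, thresholds, weights)
print(f"Poverty rate: {rate:.1%}")
```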

Percentiles:

MicroDF calculates weighted percentiles
Shows income distribution (10th, 50th, 90th percentile)
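
One simple way to think about a weighted percentile is as the smallest value at which the cumulative weight share reaches the requested percentage. The sketch below uses that definition without interpolation; `weighted_percentile` is a hypothetical name, and MicroDF may interpolate or break ties differently.

```python
def weighted_percentile(values, weights, pct):
    """Smallest value whose cumulative weight share reaches pct percent."""
    pairs = sorted(zip(values, weights))
    target = sum(weights) * pct / 100
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= target:
            return value
    return pairs[-1][0]  # guard against floating-point shortfall


# Same sample data as the Quick Start example
incomes = [10000, 20000, 30000, 40000, 50000]
weights = [1, 2, 3, 2, 1]
p10, p50, p90 = (weighted_percentile(incomes, weights, p) for p in (10, 50, 90))
print(p10, p50, p90)
```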

For Analysts

Installation

bash

uv pip install microdf-python

Quick Start

python

import microdf as mdf
import pandas as pd

# Create sample data
df = pd.DataFrame({
    'income': [10000, 20000, 30000, 40000, 50000],
    'weights': [1, 2, 3, 2, 1]
})

# Create MicroDataFrame
mdf_df = mdf.MicroDataFrame(df, weights='weights')

# All operations are weight-aware
print(f"Weighted mean: ${mdf_df.income.mean():,.0f}")
print(f"Gini coefficient: {mdf_df.income.gini():.3f}")

Common Operations

Weighted statistics:

python

mdf_df.income.mean()     # Weighted mean
mdf_df.income.median()   # Weighted median
mdf_df.income.sum()      # Weighted sum
mdf_df.income.std()      # Weighted standard deviation

Inequality metrics:

python

mdf_df.income.gini()     # Gini coefficient
mdf_df.income.top_x_pct_share(10)  # Top 10% share
mdf_df.income.top_x_pct_share(1)   # Top 1% share
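
To make the notion of a top share concrete, here is a minimal plain-Python sketch under one possible definition: the fraction of total weighted income held by the richest x percent of the weighted population, splitting the unit that straddles the cutoff. The name `top_share` is hypothetical, and `top_x_pct_share` in MicroDF may define the cutoff differently.

```python
def top_share(values, weights, pct):
    """Share of total weighted income held by the top pct percent of weight."""
    pairs = sorted(zip(values, weights), reverse=True)  # richest first
    total_income = sum(v * w for v, w in pairs)
    remaining = sum(weights) * pct / 100  # weight still to assign to the top group
    top_income = 0.0
    for value, weight in pairs:
        take = min(weight, remaining)  # split the unit that straddles the cutoff
        top_income += value * take
        remaining -= take
        if remaining <= 0:
            break
    return top_income / total_income


# Hypothetical data: five equally weighted income groups
incomes = [10000, 20000, 30000, 40000, 100000]
weights = [2, 2, 2, 2, 2]
share = top_share(incomes, weights, 20)
print(f"Top 20% share: {share:.1%}")
```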

Poverty analysis:

python

# Poverty rate (income < threshold)
poverty_rate = mdf_df.poverty_rate(
    income_measure='income',
    threshold=poverty_line
)

# Poverty gap (how far below threshold)
poverty_gap = mdf_df.poverty_gap(
    income_measure='income',
    threshold=poverty_line
)

# Deep poverty (income < 50% of threshold)
deep_poverty_rate = mdf_df.deep_poverty_rate(
    income_measure='income',
    threshold=poverty_line,
    deep_poverty_line=0.5
)
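
The three measures above can be illustrated without MicroDF. The sketch below assumes a single threshold for all units and treats the poverty gap as the total weighted shortfall below the threshold, which is one common definition; `poverty_measures` and the sample figures are hypothetical, and the MicroDF methods may normalize their results differently.

```python
def poverty_measures(incomes, weights, threshold, deep_fraction=0.5):
    """Return (poverty rate, deep poverty rate, total weighted poverty gap)."""
    total_weight = sum(weights)
    poor = deep = gap = 0.0
    for income, weight in zip(incomes, weights):
        if income < threshold:
            poor += weight
            gap += weight * (threshold - income)  # weighted shortfall
        if income < deep_fraction * threshold:
            deep += weight
    return poor / total_weight, deep / total_weight, gap


# Hypothetical data
incomes = [5000, 12000, 20000, 40000]
weights = [1, 2, 3, 4]
rate, deep_rate, gap = poverty_measures(incomes, weights, threshold=15000)
print(f"Rate: {rate:.0%}, deep rate: {deep_rate:.0%}, gap: ${gap:,.0f}")
```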

Quantiles:

python

# Deciles
mdf_df.income.decile_values()

# Quintiles
mdf_df.income.quintile_values()

# Custom quantiles
mdf_df.income.quantile(0.25)  # 25th percentile

MicroSeries

python

# Extract a Series with weights
income_series = mdf_df.income  # This is a MicroSeries

# MicroSeries operations
income_series.mean()
income_series.gini()
income_series.percentile(50)

WARNING: .values and .to_numpy() strip weights. These methods now emit a UserWarning because they return plain numpy arrays where operations like .mean() are unweighted. Always use MicroSeries methods directly for weighted calculations:

python

# ❌ WRONG - strips weights, .mean() is unweighted
ms.values.mean()
ms.to_numpy().mean()

# ✅ CORRECT - weighted automatically
ms.mean()
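
To see why losing the weights matters, the following plain-Python sketch compares an unweighted and a weighted mean on hypothetical data in which one record represents three households. It does not use MicroDF; it only shows the size of the error that an unweighted calculation can introduce.

```python
# Hypothetical survey records
incomes = [10000, 50000]
weights = [3, 1]  # the first record represents three households

# What a plain array computes once the weights are stripped
unweighted_mean = sum(incomes) / len(incomes)

# What a weight-aware calculation computes
weighted_mean = sum(v * w for v, w in zip(incomes, weights)) / sum(weights)

print(f"Unweighted: {unweighted_mean:,.0f}, weighted: {weighted_mean:,.0f}")
```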

Working with PolicyEngine Results

python

import microdf as mdf
import pandas as pd
from policyengine_us import Simulation

# Run simulation with axes (multiple households)
situation_with_axes = {...}  # See policyengine-us-skill
sim = Simulation(situation=situation_with_axes)

# Get results as arrays
incomes = sim.calculate("household_net_income", 2026)
weights = sim.calculate("household_weight", 2026)

# Create MicroDataFrame
df = pd.DataFrame({'income': incomes, 'weight': weights})
mdf_df = mdf.MicroDataFrame(df, weights='weight')

# Calculate metrics
gini = mdf_df.income.gini()
poverty_rate = mdf_df.poverty_rate('income', threshold=15000)

print(f"Gini: {gini:.3f}")
print(f"Poverty rate: {poverty_rate:.1%}")

For Contributors

Repository

Location: PolicyEngine/microdf

Clone:

bash

git clone https://github.com/PolicyEngine/microdf
cd microdf

Current Implementation

To see current API:

bash

# Main classes
cat microdf/microframe.py   # MicroDataFrame
cat microdf/microseries.py  # MicroSeries

# Key modules
cat microdf/generic.py      # Generic weighted operations
cat microdf/inequality.py   # Gini, top shares
cat microdf/poverty.py      # Poverty metrics

To see all methods:

bash

# MicroDataFrame methods
grep "def " microdf/microframe.py

# MicroSeries methods
grep "def " microdf/microseries.py

Testing

To see test patterns:

bash

ls tests/
cat tests/test_microframe.py

Run tests:

bash

make test

# Or
pytest tests/ -v

Contributing

Before contributing:

Check if method already exists
Ensure it's weighted correctly
Add tests
Follow policyengine-standards-skill

Common contributions:

New inequality metrics
New poverty measures
Performance optimizations
Bug fixes

Advanced Patterns

Custom Aggregations

python

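# Define a custom weighted aggregation on plain (unweighted) inputs
def weighted_operation(values, weights):
    return (values * weights).sum() / weights.sum()

# Pass plain pandas Series (df is the source DataFrame from Quick Start).
# MicroSeries.sum() is already weighted, so passing mdf_df.income here
# would apply the weights a second time.
result = weighted_operation(df['income'], df['weights'])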

Groupby Operations

python

# Group by with weights
grouped = mdf_df.groupby('state')
state_means = grouped.income.mean()  # Weighted means by state
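
For intuition about what a weighted group mean computes, here is a minimal plain-Python sketch: within each group, the weighted sum of values is divided by the group's total weight. The helper `weighted_group_means` and the state data are hypothetical and do not reflect how MicroDF implements groupby.

```python
from collections import defaultdict


def weighted_group_means(groups, values, weights):
    """Weighted mean of values within each group label."""
    weighted_sums = defaultdict(float)
    weight_totals = defaultdict(float)
    for group, value, weight in zip(groups, values, weights):
        weighted_sums[group] += value * weight
        weight_totals[group] += weight
    return {g: weighted_sums[g] / weight_totals[g] for g in weighted_sums}


# Hypothetical records
states = ["CA", "CA", "NY", "NY"]
incomes = [40000, 60000, 30000, 50000]
weights = [1, 3, 2, 2]
means = weighted_group_means(states, incomes, weights)
print(means)
```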

Inequality Decomposition

To see decomposition methods:

bash

grep -r -A 20 "def.*decomp" microdf/

Integration Examples

Example 1: PolicyEngine Blog Post Analysis

python

# Pattern from PolicyEngine blog posts
import microdf as mdf
import pandas as pd

# Get simulation results (baseline_sim and reform_sim are existing Simulation objects)
baseline_income = baseline_sim.calculate("household_net_income", 2026)
reform_income = reform_sim.calculate("household_net_income", 2026)
weights = baseline_sim.calculate("household_weight", 2026)

# Create MicroDataFrame
df = pd.DataFrame({
    'baseline_income': baseline_income,
    'reform_income': reform_income,
    'weight': weights
})
mdf_df = mdf.MicroDataFrame(df, weights='weight')

# Calculate impacts
baseline_gini = mdf_df.baseline_income.gini()
reform_gini = mdf_df.reform_income.gini()

print(f"Gini change: {reform_gini - baseline_gini:+.4f}")

Example 2: Poverty Analysis

python

# Calculate poverty under baseline and reform
import microdf as mdf
import pandas as pd
from policyengine_us import Simulation

baseline_sim = Simulation(situation=situation)
reform_sim = Simulation(situation=situation, reform=reform)

# Get incomes — use household_weight (the only calibrated weight) mapped to spm_unit
baseline_income = baseline_sim.calculate("spm_unit_net_income", 2026)
reform_income = reform_sim.calculate("spm_unit_net_income", 2026)
spm_threshold = baseline_sim.calculate("spm_unit_poverty_threshold", 2026)
weights = baseline_sim.calculate("household_weight", 2026, map_to="spm_unit")

# Calculate poverty rates
df_baseline = mdf.MicroDataFrame(
    pd.DataFrame({'income': baseline_income, 'threshold': spm_threshold, 'weight': weights}),
    weights='weight'
)

poverty_baseline = (df_baseline.income < df_baseline.threshold).mean()  # Weighted

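# Repeat for the reform scenario
df_reform = mdf.MicroDataFrame(
    pd.DataFrame({'income': reform_income, 'threshold': spm_threshold, 'weight': weights}),
    weights='weight'
)

poverty_reform = (df_reform.income < df_reform.threshold).mean()  # Weighted

print(f"Poverty reduction: {(poverty_baseline - poverty_reform):.1%}")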

Package Status

Maturity: Stable, production-ready
API stability: Stable (breaking changes are rare)
Performance: Optimized for large datasets

To see version:

bash

pip show microdf-python

To see changelog:

bash

cat CHANGELOG.md  # In microdf repo

Related Skills

policyengine-us-skill - Generating data for microdf analysis
policyengine-analysis-skill - Using microdf in policy analysis
policyengine-us-data-skill - Data sources for microdf

Resources

Repository: https://github.com/PolicyEngine/microdf
PyPI: https://pypi.org/project/microdf-python/
Issues: https://github.com/PolicyEngine/microdf/issues

Maintainer

PolicyEngine Core maintainer

Source details

Full Name: PolicyEngine/policyengine-claude
Branch: main
Path in repo: skills/data-science/microdf-skill
License: MIT License
