policyengine-uk-data
UK survey data enhancement - FRS with WAS imputation patterns and cross-repo variable workflows. Triggers: "FRS", "Family Resources Survey", "WAS", "Wealth and Assets Survey", "UK data", "UK microdata", "wealth imputation", "policyengine-uk-data"
PolicyEngine UK Data
PolicyEngine UK Data provides enhanced Family Resources Survey (FRS) datasets with imputed variables from the Wealth and Assets Survey (WAS).
For Users
What is policyengine-uk-data?
PolicyEngine UK uses the Family Resources Survey (FRS) as its primary microdata source. The FRS contains household demographics, income, and benefits but lacks detailed wealth information. The Wealth and Assets Survey (WAS) provides comprehensive wealth data but has a smaller sample. This package imputes wealth variables from WAS to FRS.
Key datasets:
- FRS (Family Resources Survey): Main UK household survey with ~20,000 households
- WAS (Wealth and Assets Survey): Detailed wealth survey with ~20,000 households
- Enhanced FRS: FRS with imputed wealth variables from WAS
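To make the idea of survey-to-survey imputation concrete, here is a minimal sketch on toy data. It uses a random hot-deck within cells: each FRS row receives the value of a randomly drawn WAS row that shares its age band and region. This is a deliberately simple stand-in, not the method this package uses. The function name `hot_deck_impute`, the column names, and the toy values are all assumptions for illustration.

```python
import numpy as np
import pandas as pd


def hot_deck_impute(donor, recipient, target, cells, seed=0):
    """Copy `target` from a random donor row in the same cell (toy method)."""
    rng = np.random.default_rng(seed)
    # Group donor values by cell so each recipient row draws from its own cell.
    pools = {key: grp[target].to_numpy() for key, grp in donor.groupby(cells)}
    fallback = donor[target].to_numpy()  # used when a cell has no donors

    values = []
    for key in recipient[cells].itertuples(index=False, name=None):
        pool = pools.get(key, fallback)
        values.append(rng.choice(pool))

    out = recipient.copy()  # leave the caller's DataFrame untouched
    out[target] = values
    return out


# Toy donor (WAS-like) and recipient (FRS-like) data; values are made up.
was = pd.DataFrame({
    "age_band": ["16-34", "16-34", "35-64", "35-64"],
    "region": ["LONDON", "LONDON", "WALES", "WALES"],
    "net_wealth": [5_000.0, 8_000.0, 120_000.0, 150_000.0],
})
frs = pd.DataFrame({
    "age_band": ["16-34", "35-64", "65+"],
    "region": ["LONDON", "WALES", "WALES"],
})

frs_imputed = hot_deck_impute(was, frs, "net_wealth", ["age_band", "region"])
print(frs_imputed)
```

The third FRS row has no matching WAS cell, so it falls back to drawing from all donors. Model-based methods avoid this kind of empty-cell problem by predicting from the common variables instead of matching on them exactly.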
For Analysts
Repository
Location: PolicyEngine/policyengine-uk-data
Clone:
git clone https://github.com/PolicyEngine/policyengine-uk-data
cd policyengine-uk-data
Structure
policyengine_uk_data/
├── datasets/                      # Dataset definitions
│   └── frs/                       # FRS enhancement
│       ├── raw_frs.py             # Raw FRS loader
│       ├── calibration.py         # Weight calibration
│       └── imputations/           # Variable imputation
│           ├── wealth.py          # WAS wealth imputation
│           ├── student_loans.py   # Student loan balances
│           └── ...
└── storage/                       # Data storage utilities
Installation
From PyPI:
uv pip install policyengine-uk-data
Development:
uv pip install -e .
For Contributors
Imputation Pattern
The standard pattern for adding WAS-to-FRS imputations:
1. Identify the variables:
- Source: WAS variables (complete wealth data)
- Target: FRS (needs these variables)
- Common variables: Demographics that exist in both surveys
2. Follow the wealth.py pattern:
# In policyengine_uk_data/datasets/frs/imputations/my_variable.py
from policyengine_uk_data.datasets.frs.imputations.imputation_utils import (
    impute_from_was,
)


def add_my_variable(frs, was):
    """
    Impute my_variable from WAS to FRS.

    Args:
        frs: Enhanced FRS DataFrame
        was: WAS DataFrame with target variable

    Returns:
        Enhanced FRS with imputed variable
    """
    return impute_from_was(
        donor=was,
        recipient=frs,
        target_variable='my_variable',
        common_variables=[
            'age',
            'region',
            'employment_status',
            # Add relevant predictors
        ],
        method='quantile_forest',  # Or other microimpute method
    )
3. Update the RENAMES dictionary:
If the variable has different names in WAS vs FRS:
# In the relevant module
RENAMES = {
    "was_variable_name": "standardized_name",
    "frs_variable_name": "standardized_name",
}
4. Add to the pipeline:
Register the imputation in the FRS enhancement pipeline so it runs automatically.
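Steps 1 and 3 can be sketched together: harmonize column names with a `RENAMES` mapping, then keep only the predictors that exist in both surveys. The raw column names below (`was_age`, `frs_age`, and so on) and the helper names `harmonize` and `shared_predictors` are hypothetical; the real WAS and FRS names will differ.

```python
import pandas as pd

# Hypothetical raw column names mapped to one standardized name each.
RENAMES = {
    "was_age": "age",
    "frs_age": "age",
    "was_region": "region",
    "frs_region": "region",
}


def harmonize(df, renames):
    """Rename columns to standardized names; absent keys are ignored by pandas."""
    return df.rename(columns=renames)


def shared_predictors(donor, recipient, candidates):
    """Return the candidate predictors available in both surveys, in order."""
    return [c for c in candidates if c in donor.columns and c in recipient.columns]


was = harmonize(
    pd.DataFrame({"was_age": [40], "was_region": ["WALES"], "net_wealth": [1e5]}),
    RENAMES,
)
frs = harmonize(
    pd.DataFrame({"frs_age": [35], "frs_region": ["LONDON"], "hours_worked": [37]}),
    RENAMES,
)

predictors = shared_predictors(was, frs, ["age", "region", "hours_worked"])
print(predictors)  # ['age', 'region']
```

Here `hours_worked` is dropped because the toy WAS frame does not have it. A check like this before calling the imputation makes a missing predictor fail early and visibly.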
Example: Student Loan Imputation
PR #252 added student loan balance imputation:
# policyengine_uk_data/datasets/frs/imputations/student_loans.py
def add_student_loan_balance(frs, was):
    """
    Impute student loan balances from WAS to FRS.

    WAS contains:
    - total_loans: All loan balances
    - total_loans_exc_slc: Loans excluding student loans

    Derived variable:
    - student_loan_balance = total_loans - total_loans_exc_slc
    """
    return impute_from_was(
        donor=was,
        recipient=frs,
        target_variable='student_loan_balance',
        common_variables=[
            'age',
            'highest_qualification',
            'region',
            'employment_status',
            'income',
        ],
        method='quantile_forest',
    )
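The derivation described in the docstring can be sketched on toy data as follows. The values are made up, and clipping at zero is an assumption added here to guard against inconsistent reports; whether the real pipeline clips is not stated in this document.

```python
import pandas as pd

# Toy WAS-like frame; the column names follow the docstring above.
was = pd.DataFrame({
    "total_loans": [12_000.0, 500.0, 0.0, 3_000.0],
    "total_loans_exc_slc": [2_000.0, 500.0, 0.0, 3_500.0],
})

# Student loans are what remains after removing non-student loans.
# clip(lower=0) is an assumption: it zeroes out rows where the reported
# non-student loans exceed the reported total.
was["student_loan_balance"] = (
    was["total_loans"] - was["total_loans_exc_slc"]
).clip(lower=0)

print(was["student_loan_balance"].tolist())  # [10000.0, 0.0, 0.0, 0.0]
```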
Common Variables for WAS-FRS Imputation
Demographics (always available):
- age
- sex
- region (UK region codes)
Economic status:
- employment_status
- income (or income bands)
- hours_worked
Household:
- household_size
- num_children
- tenure_type (own/rent)
Education:
- highest_qualification
- currently_studying
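Predictors such as age and income are often more comparable across surveys once banded. A minimal sketch using `pandas.cut` follows; the band edges, labels, and the helper name `add_bands` are illustrative assumptions, not the definitions used in the repository.

```python
import pandas as pd


def add_bands(df):
    """Add banded versions of age and income (band edges are illustrative)."""
    out = df.copy()
    out["age_band"] = pd.cut(
        out["age"],
        bins=[16, 25, 35, 50, 65, 120],
        right=False,  # intervals are [lower, upper)
        labels=["16-24", "25-34", "35-49", "50-64", "65+"],
    )
    out["income_band"] = pd.cut(
        out["income"],
        bins=[0, 15_000, 30_000, 60_000, float("inf")],
        right=False,
        labels=["<15k", "15-30k", "30-60k", "60k+"],
    )
    return out


frs = add_bands(pd.DataFrame({"age": [22, 41, 70], "income": [9_000, 30_000, 75_000]}))
print(frs[["age_band", "income_band"]].astype(str).values.tolist())
# [['16-24', '<15k'], ['35-49', '30-60k'], ['65+', '60k+']]
```

Applying the same function to both WAS and FRS keeps the bands identical in the donor and recipient data. Values outside the bins (for example negative income) become missing and would need separate handling.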
Testing
Run tests:
make test
# Or pytest directly
pytest policyengine_uk_data/tests/ -v
Test structure:
# Check if imputation was added
pytest policyengine_uk_data/tests/test_imputations.py::test_student_loan_imputation
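A test for a new imputation might look like the sketch below. It runs on toy data and checks only generic properties (column present, row count unchanged, no missing or negative values). The helper `check_imputed_column` and the test name are hypothetical and are not taken from the repository's test suite.

```python
import pandas as pd


def check_imputed_column(before, after, column):
    """Basic sanity checks for a newly imputed, non-negative column."""
    assert column in after.columns, f"{column} was not added"
    assert len(after) == len(before), "imputation changed the number of rows"
    assert after[column].notna().all(), f"{column} contains missing values"
    assert (after[column] >= 0).all(), f"{column} contains negative values"


def test_student_loan_imputation_sketch():
    # Toy stand-ins for the real datasets; a real test would call the
    # imputation function instead of assigning values by hand.
    frs = pd.DataFrame({"age": [22, 41, 70]})
    frs_imputed = frs.assign(student_loan_balance=[18_000.0, 0.0, 0.0])
    check_imputed_column(frs, frs_imputed, "student_loan_balance")
```

pytest collects any function whose name starts with `test_`, so a file containing this function can be run with the commands above.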
Validation
After adding an imputation, validate:
1. Distribution check:
# Compare imputed FRS distribution to WAS source
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.hist(was['my_variable'], bins=50)
ax1.set_title('WAS (source)')
ax2.hist(frs_imputed['my_variable'], bins=50)
ax2.set_title('FRS (imputed)')
plt.show()
2. Aggregate totals:
# Check population-weighted totals match administrative data
weighted_total = (frs_imputed['my_variable'] * frs_imputed['weight']).sum()
print(f"Imputed total: {weighted_total:,.0f}")
# Compare to known UK aggregate
3. Conditional relationships:
# Verify relationships are preserved
# E.g., student loan balance by age and qualification
# (assumes an age_band column has been derived from age)
frs_imputed.groupby(['age_band', 'highest_qualification'])['student_loan_balance'].mean()
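The distribution and aggregate checks can also be done numerically. The sketch below computes a weighted quantile and the relative gap between a weighted total and a benchmark. The helper names `weighted_quantile` and `relative_gap` are hypothetical, and the benchmark figure is a placeholder, not a real statistic.

```python
import numpy as np
import pandas as pd


def weighted_quantile(values, weights, q):
    """Quantile of `values` where each observation counts `weights` times."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    cum_share = np.cumsum(weights) / weights.sum()
    # First value whose cumulative weight share reaches q; the min() guards
    # against floating-point rounding when q is 1.0.
    idx = min(np.searchsorted(cum_share, q), len(values) - 1)
    return values[idx]


def relative_gap(estimate, benchmark):
    """Signed gap between estimate and benchmark, as a share of the benchmark."""
    return (estimate - benchmark) / benchmark


# Toy imputed data; values and weights are made up.
frs_imputed = pd.DataFrame({
    "my_variable": [0.0, 10.0, 20.0, 100.0],
    "weight": [3.0, 1.0, 1.0, 1.0],
})

median = weighted_quantile(frs_imputed["my_variable"], frs_imputed["weight"], 0.5)
total = (frs_imputed["my_variable"] * frs_imputed["weight"]).sum()
placeholder_benchmark = 125.0  # placeholder, not a real administrative total

print(median)  # 0.0
print(total)   # 130.0
print(relative_gap(total, placeholder_benchmark))
```

Comparing a handful of weighted quantiles between WAS and the imputed FRS catches shifts that a histogram can hide, especially in the upper tail of wealth variables.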
Common Patterns
Pattern 1: Simple Variable Imputation
# Most common: direct variable imputation
def add_variable(frs, was):
    return impute_from_was(
        donor=was,
        recipient=frs,
        target_variable='my_var',
        common_variables=['age', 'income', 'region'],
    )
Pattern 2: Derived Variable Imputation
# When WAS has components but not the exact variable
def add_derived_variable(frs, was):
    # First derive the variable in WAS (assign returns a copy,
    # so the caller's DataFrame is not modified)
    was = was.assign(net_wealth=was['total_assets'] - was['total_debts'])
    # Then impute
    return impute_from_was(
        donor=was,
        recipient=frs,
        target_variable='net_wealth',
        common_variables=['age', 'income', 'region'],
    )
Pattern 3: Multiple Related Variables
# Impute several related variables together
def add_wealth_components(frs, was):
    variables = [
        'property_wealth',
        'financial_wealth',
        'pension_wealth',
        'debt',
    ]
    for var in variables:
        frs = impute_from_was(
            donor=was,
            recipient=frs,
            target_variable=var,
            common_variables=['age', 'income', 'region'],
        )
    return frs
Integration with PolicyEngine UK
Usage flow:
1. Load raw FRS
↓
2. Add WAS imputations (wealth, student loans, etc.)
↓
3. Calibrate weights to administrative benchmarks
↓
4. Validate against known UK totals
↓
5. Package for policyengine-uk
↓
6. Use for UK policy simulations
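The flow above can be sketched as a chain of functions that each take and return a DataFrame. Everything here is a toy: `add_toy_wealth` stands in for a real WAS imputation, `calibrate_to_total` scales weights uniformly to a single placeholder target, and `run_pipeline` is a hypothetical helper. The real pipeline calibrates against multiple administrative benchmarks and is organized differently.

```python
import pandas as pd


def add_toy_wealth(frs):
    """Stand-in for a WAS imputation step: adds a constant wealth column."""
    return frs.assign(net_wealth=50_000.0)


def calibrate_to_total(frs, target_households):
    """Scale weights uniformly so they sum to a benchmark household count."""
    factor = target_households / frs["weight"].sum()
    return frs.assign(weight=frs["weight"] * factor)


def run_pipeline(frs, steps):
    """Apply each step in order; every step takes and returns a DataFrame."""
    for step in steps:
        frs = step(frs)
    return frs


# Toy raw data; IDs, weights, and the target of 6,000 are made up.
raw_frs = pd.DataFrame({
    "household_id": [1, 2, 3],
    "weight": [900.0, 1100.0, 1000.0],
})

enhanced = run_pipeline(
    raw_frs,
    [add_toy_wealth, lambda df: calibrate_to_total(df, target_households=6_000)],
)
print(enhanced["weight"].sum())  # 6000.0
```

Keeping each stage as a pure function of the DataFrame makes it straightforward to test stages in isolation and to insert a new imputation without touching the others.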
In policyengine-uk:
from policyengine_uk import Microsimulation
# Uses enhanced FRS under the hood
sim = Microsimulation()
sim.calculate('student_loan_repayment', period='2026')
# Uses imputed student_loan_balance variable
Related Skills
- microimpute-skill - ML imputation methods (underlying technique)
- policyengine-uk-skill - UK policy model (uses this data)
- microcalibrate-skill - Weight calibration (next step after imputation)
- microdf-skill - Working with survey microdata
Resources
Repository: https://github.com/PolicyEngine/policyengine-uk-data
Dependencies: policyengine-uk, policyengine-core, microdf, microimpute
Data sources: