Agent skill

policyengine-uk-data

UK survey data enhancement - FRS with WAS imputation patterns and cross-repo variable workflows. Triggers: "FRS", "Family Resources Survey", "WAS", "Wealth and Assets Survey", "UK data", "UK microdata", "wealth imputation", "policyengine-uk-data"

Install this agent skill to your Project

npx add-skill https://github.com/PolicyEngine/policyengine-claude/tree/main/skills/data-science/policyengine-uk-data-skill

PolicyEngine UK Data

PolicyEngine UK Data provides enhanced Family Resources Survey (FRS) datasets with imputed variables from the Wealth and Assets Survey (WAS).

For Users

What is policyengine-uk-data?

PolicyEngine UK uses the Family Resources Survey (FRS) as its primary microdata source. The FRS contains household demographics, income, and benefits but lacks detailed wealth information. The Wealth and Assets Survey (WAS) provides comprehensive wealth data but has a smaller sample. This package imputes wealth variables from WAS to FRS.

Key datasets:

FRS (Family Resources Survey): Main UK household survey with ~20,000 households
WAS (Wealth and Assets Survey): Detailed wealth survey with ~20,000 households
Enhanced FRS: FRS with imputed wealth variables from WAS
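
To make the idea of imputing from a donor survey to a recipient survey concrete, here is a minimal sketch using nearest-neighbour hot-deck matching on toy records. It is not the package's method (the contributor section below refers to a quantile forest from microimpute); the records, field values, and the function name `hot_deck_impute` are illustrative assumptions.

```python
# Toy illustration of donor-to-recipient imputation (nearest-neighbour hot deck).
# All records, field names, and values are made up for illustration.

def hot_deck_impute(donors, recipients, target, predictors):
    """Copy `target` from the closest donor (squared distance on predictors)."""
    imputed = []
    for rec in recipients:
        closest = min(
            donors,
            key=lambda d: sum((d[p] - rec[p]) ** 2 for p in predictors),
        )
        # Build a new record so the recipient input is left unchanged
        imputed.append({**rec, target: closest[target]})
    return imputed


# "WAS-like" donors carry the wealth variable; "FRS-like" recipients do not.
was_toy = [
    {"age": 30, "income": 25_000, "net_wealth": 10_000},
    {"age": 50, "income": 40_000, "net_wealth": 150_000},
    {"age": 70, "income": 20_000, "net_wealth": 300_000},
]
frs_toy = [
    {"age": 32, "income": 26_000},
    {"age": 68, "income": 21_000},
]

frs_enhanced = hot_deck_impute(was_toy, frs_toy, "net_wealth", ["age", "income"])
for row in frs_enhanced:
    print(row)
```

The predictors are left unscaled here, so income dominates the distance; a real matching scheme would normally standardise them first.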

For Analysts

Repository

Location: PolicyEngine/policyengine-uk-data

Clone:

bash

git clone https://github.com/PolicyEngine/policyengine-uk-data
cd policyengine-uk-data

Structure

policyengine_uk_data/
├── datasets/          # Dataset definitions
│   └── frs/          # FRS enhancement
│       ├── raw_frs.py           # Raw FRS loader
│       ├── calibration.py       # Weight calibration
│       └── imputations/         # Variable imputation
│           ├── wealth.py        # WAS wealth imputation
│           ├── student_loans.py # Student loan balances
│           └── ...
└── storage/          # Data storage utilities

Installation

From PyPI:

bash

uv pip install policyengine-uk-data

Development:

bash

uv pip install -e .

For Contributors

Imputation Pattern

The standard pattern for adding WAS-to-FRS imputations:

1. Identify the variables:

Source: WAS variables (complete wealth data)
Target: FRS (needs these variables)
Common variables: Demographics that exist in both surveys
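
Before writing the imputation function, it can help to confirm that each candidate predictor really exists in both surveys. The helper below is a hypothetical sketch (the name `shared_predictors` and the column lists are assumptions, not part of the repository):

```python
def shared_predictors(donor_columns, recipient_columns, candidates):
    """Split candidates into those present in both surveys and those missing."""
    donor_columns = set(donor_columns)
    recipient_columns = set(recipient_columns)
    usable = [
        c for c in candidates
        if c in donor_columns and c in recipient_columns
    ]
    missing = [c for c in candidates if c not in usable]
    return usable, missing


# Illustrative column names only
was_columns = ["age", "region", "employment_status", "total_loans"]
frs_columns = ["age", "region", "employment_status", "hours_worked"]

usable, missing = shared_predictors(
    was_columns, frs_columns, ["age", "region", "hours_worked"]
)
print("usable:", usable)
print("missing from at least one survey:", missing)
```

With pandas DataFrames, `list(was.columns)` and `list(frs.columns)` could be passed in directly.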

2. Follow the wealth.py pattern:

python

# In policyengine_uk_data/datasets/frs/imputations/my_variable.py

from policyengine_uk_data.datasets.frs.imputations.imputation_utils import (
    impute_from_was
)

def add_my_variable(frs, was):
    """
    Impute my_variable from WAS to FRS.

    Args:
        frs: Enhanced FRS DataFrame
        was: WAS DataFrame with target variable

    Returns:
        Enhanced FRS with imputed variable
    """
    return impute_from_was(
        donor=was,
        recipient=frs,
        target_variable='my_variable',
        common_variables=[
            'age',
            'region',
            'employment_status',
            # Add relevant predictors
        ],
        method='quantile_forest'  # Or other microimpute method
    )

3. Update the RENAMES dictionary:

If the variable has different names in WAS vs FRS:

python

# In the relevant module
RENAMES = {
    "was_variable_name": "standardized_name",
    "frs_variable_name": "standardized_name",
}
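
As a rough illustration of how such a mapping harmonises names, the sketch below applies it to plain dictionaries. The helper `apply_renames` and the sample rows are assumptions for illustration; with pandas DataFrames the equivalent would be `df.rename(columns=RENAMES)`.

```python
# Same shape as the RENAMES dictionary above; the names are placeholders.
RENAMES_EXAMPLE = {
    "was_variable_name": "standardized_name",
    "frs_variable_name": "standardized_name",
}


def apply_renames(record, renames):
    """Return a copy of `record` with keys mapped through `renames`."""
    return {renames.get(key, key): value for key, value in record.items()}


was_row = {"age": 41, "was_variable_name": 1200.0}
frs_row = {"age": 39, "frs_variable_name": 0.0}

print(apply_renames(was_row, RENAMES_EXAMPLE))
print(apply_renames(frs_row, RENAMES_EXAMPLE))
```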

4. Add to the pipeline:
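
The source does not spell this step out. One plausible shape, offered as an assumption and not as the repository's actual pipeline code, is to apply each `add_*` function in sequence, passing the FRS data through. The function `run_imputations` and the stand-in steps below are hypothetical:

```python
# Hypothetical sketch: the real pipeline entry point is not given in the source.

def run_imputations(frs, was, steps):
    """Apply each imputation step in order, threading the FRS data through."""
    for step in steps:
        frs = step(frs, was)
    return frs


# Stand-in steps with the same (frs, was) -> frs signature as add_my_variable.
def add_flag_a(frs, was):
    return {**frs, "a": was["a"]}


def add_flag_b(frs, was):
    return {**frs, "b": was["b"]}


result = run_imputations({"id": 1}, {"a": 10, "b": 20}, [add_flag_a, add_flag_b])
print(result)
```

Keeping every step to the same signature means a new imputation only needs to be appended to the list of steps.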

Example: Student Loan Imputation

PR #252 added student loan balance imputation:

python

# policyengine_uk_data/datasets/frs/imputations/student_loans.py

from policyengine_uk_data.datasets.frs.imputations.imputation_utils import (
    impute_from_was
)

def add_student_loan_balance(frs, was):
    """
    Impute student loan balances from WAS to FRS.

    WAS contains:
    - total_loans: All loan balances
    - total_loans_exc_slc: Loans excluding student loans

    Derived variable:
    - student_loan_balance = total_loans - total_loans_exc_slc
    """
    # Derive the target on a copy of WAS so the caller's DataFrame is unchanged
    was = was.assign(
        student_loan_balance=was['total_loans'] - was['total_loans_exc_slc']
    )
    return impute_from_was(
        donor=was,
        recipient=frs,
        target_variable='student_loan_balance',
        common_variables=[
            'age',
            'highest_qualification',
            'region',
            'employment_status',
            'income'
        ],
        method='quantile_forest'
    )

Common Variables for WAS-FRS Imputation

Demographics (always available):

age
sex
region (UK region codes)

Economic status:

employment_status
income (or income bands)
hours_worked

Household:

household_size
num_children
tenure_type (own/rent)

Education:

highest_qualification
currently_studying
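
Where a predictor such as income needs to be expressed in bands so that both surveys share the same coding, a small banding helper is usually enough. The band edges and labels below are illustrative assumptions, not taken from either survey's documentation:

```python
from bisect import bisect_right

# Illustrative band edges in pounds per year (assumed, not from FRS or WAS)
INCOME_EDGES = [10_000, 20_000, 30_000, 50_000]
INCOME_LABELS = ["<10k", "10-20k", "20-30k", "30-50k", "50k+"]


def income_band(income):
    """Map an income amount to a band; each band includes its lower edge."""
    return INCOME_LABELS[bisect_right(INCOME_EDGES, income)]


print([income_band(x) for x in (5_000, 10_000, 29_999, 75_000)])
```

The same approach works for age bands; with pandas, `pd.cut` does the equivalent on a whole column.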

Testing

Run tests:

bash

make test

# Or pytest directly
pytest policyengine_uk_data/tests/ -v

Run a single test:

bash

# Run only the student loan imputation test
pytest policyengine_uk_data/tests/test_imputations.py::test_student_loan_imputation
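
The contents of `test_imputations.py` are not shown in the source. As a hedged sketch of what a test for a new imputation might check, the hypothetical example below runs basic sanity checks on toy rows; the helper name and the checks are assumptions:

```python
# Hypothetical test shape; the real test_imputations.py may differ.

def check_imputed_column(rows, column):
    """Basic sanity checks on an imputed column in a list of row dicts."""
    values = [row.get(column) for row in rows]
    assert all(v is not None for v in values), f"{column} missing for some rows"
    assert all(v >= 0 for v in values), f"{column} has negative values"
    return True


def test_student_loan_imputation_toy():
    # Toy stand-in for the enhanced FRS
    toy_frs = [
        {"age": 24, "student_loan_balance": 32_000.0},
        {"age": 61, "student_loan_balance": 0.0},
    ]
    assert check_imputed_column(toy_frs, "student_loan_balance")


# pytest would collect the test by name; calling it here keeps the sketch runnable
test_student_loan_imputation_toy()
```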

Validation

After adding an imputation, validate:

1. Distribution check:

python

# Compare imputed FRS distribution to WAS source
import matplotlib.pyplot as plt

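fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), sharex=True)
# density=True normalises each histogram so surveys of different sizes are comparable
ax1.hist(was['my_variable'], bins=50, density=True)
ax1.set_title('WAS (source)')
ax2.hist(frs_imputed['my_variable'], bins=50, density=True)
ax2.set_title('FRS (imputed)')
plt.show()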
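
A visual comparison can be backed by a numeric one. The sketch below compares deciles of the source and imputed values on synthetic data; it ignores survey weights for simplicity, and the function name `decile_gaps` and the lognormal toy data are assumptions.

```python
import random
import statistics


def decile_gaps(source, imputed):
    """Relative gap between imputed and source deciles (nine cut points)."""
    src = statistics.quantiles(source, n=10)
    imp = statistics.quantiles(imputed, n=10)
    return [(i - s) / s if s else float("nan") for s, i in zip(src, imp)]


# Synthetic stand-ins for the WAS source and the imputed FRS column
rng = random.Random(0)
was_values = [rng.lognormvariate(10, 1) for _ in range(5_000)]
frs_values = [rng.lognormvariate(10, 1) for _ in range(5_000)]

for decile, gap in enumerate(decile_gaps(was_values, frs_values), start=1):
    print(f"decile {decile}: {gap:+.1%}")
```

Large gaps at particular deciles point to where the imputed distribution departs from the source, which a histogram alone can hide.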

2. Aggregate totals:

python

# Check population-weighted totals match administrative data
weighted_total = (frs_imputed['my_variable'] * frs_imputed['weight']).sum()
print(f"Imputed total: {weighted_total:,.0f}")
# Compare to known UK aggregate
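
To turn that comparison into a number, the relative error against a benchmark can be computed as below. The balances, weights, and benchmark value are placeholders chosen for illustration, not real UK figures:

```python
def weighted_total(values, weights):
    """Sum of value times weight across records."""
    return sum(v * w for v, w in zip(values, weights))


def relative_error(estimate, benchmark):
    """Signed relative difference between an estimate and a benchmark."""
    return (estimate - benchmark) / benchmark


# Placeholder data: three records and a made-up benchmark total
balances = [0.0, 30_000.0, 12_000.0]
weights = [1_500.0, 900.0, 1_100.0]
benchmark = 40_000_000.0

total = weighted_total(balances, weights)
print(f"Imputed total: {total:,.0f}")
print(f"Relative error vs benchmark: {relative_error(total, benchmark):+.1%}")
```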

3. Conditional relationships:

python

# Verify relationships are preserved
# E.g., mean student loan balance by age band and qualification
# (assumes an age_band column has already been derived from age)
frs_imputed.groupby(['age_band', 'highest_qualification'])['student_loan_balance'].mean()
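
The group means above are unweighted. Since the FRS is a weighted survey, a weighted version may be more informative; the sketch below shows one way to compute it on plain records. The helper name, field names, and toy rows are assumptions:

```python
from collections import defaultdict


def weighted_group_means(rows, group_keys, value_key, weight_key="weight"):
    """Weighted mean of `value_key` within each combination of `group_keys`."""
    sums = defaultdict(float)
    totals = defaultdict(float)
    for row in rows:
        group = tuple(row[k] for k in group_keys)
        sums[group] += row[value_key] * row[weight_key]
        totals[group] += row[weight_key]
    return {group: sums[group] / totals[group] for group in sums}


# Toy rows standing in for the imputed FRS
toy_rows = [
    {"age_band": "18-29", "qual": "degree", "student_loan_balance": 30_000, "weight": 2},
    {"age_band": "18-29", "qual": "degree", "student_loan_balance": 20_000, "weight": 1},
    {"age_band": "50+", "qual": "degree", "student_loan_balance": 0, "weight": 3},
]

means = weighted_group_means(toy_rows, ["age_band", "qual"], "student_loan_balance")
for group, mean in means.items():
    print(group, round(mean, 2))
```

Running the same function on the WAS donor data and comparing group by group shows whether the conditional pattern survived the imputation.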

Common Patterns

Pattern 1: Simple Variable Imputation

python

# Most common: direct variable imputation
def add_variable(frs, was):
    return impute_from_was(
        donor=was,
        recipient=frs,
        target_variable='my_var',
        common_variables=['age', 'income', 'region']
    )

Pattern 2: Derived Variable Imputation

python

# When WAS has components but not the exact variable
def add_derived_variable(frs, was):
    # First derive the variable on a copy of WAS (avoids mutating the caller's data)
    was = was.assign(net_wealth=was['total_assets'] - was['total_debts'])

    # Then impute
    return impute_from_was(
        donor=was,
        recipient=frs,
        target_variable='net_wealth',
        common_variables=['age', 'income', 'region']
    )

Pattern 3: Multiple Related Variables

python

# Impute several related variables together
def add_wealth_components(frs, was):
    variables = [
        'property_wealth',
        'financial_wealth',
        'pension_wealth',
        'debt'
    ]

    for var in variables:
        frs = impute_from_was(
            donor=was,
            recipient=frs,
            target_variable=var,
            common_variables=['age', 'income', 'region']
        )

    return frs

Integration with PolicyEngine UK

Usage flow:

1. Load raw FRS
   ↓
2. Add WAS imputations (wealth, student loans, etc.)
   ↓
3. Calibrate weights to administrative benchmarks
   ↓
4. Validate against known UK totals
   ↓
5. Package for policyengine-uk
   ↓
6. Use for UK policy simulations
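
For intuition about step 3 only, the simplest possible calibration is a uniform rescaling of weights so that one weighted total matches one benchmark. This is a deliberately reduced sketch, not the package's calibration routine, which would need to match many targets at once; the function name and the numbers are assumptions.

```python
def scale_weights_to_target(values, weights, target_total):
    """Uniformly rescale weights so the weighted total of `values` hits the target."""
    current = sum(v * w for v, w in zip(values, weights))
    if current == 0:
        raise ValueError("Cannot rescale: current weighted total is zero")
    factor = target_total / current
    return [w * factor for w in weights]


# Each record counts as one household; the target household count is a placeholder
household_counts = [1, 1, 1]
weights = [1_000.0, 2_000.0, 1_000.0]

new_weights = scale_weights_to_target(household_counts, weights, 5_000.0)
print(new_weights)
```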

In policyengine-uk:

python

from policyengine_uk import Microsimulation

# Uses enhanced FRS under the hood
sim = Microsimulation()
sim.calculate('student_loan_repayment', period='2026')
# Uses imputed student_loan_balance variable

Related Skills

microimpute-skill - ML imputation methods (underlying technique)
policyengine-uk-skill - UK policy model (uses this data)
microcalibrate-skill - Weight calibration (next step after imputation)
microdf-skill - Working with survey microdata

Resources

Repository: https://github.com/PolicyEngine/policyengine-uk-data
Dependencies: policyengine-uk, policyengine-core, microdf, microimpute
Data sources: Family Resources Survey (FRS), Wealth and Assets Survey (WAS)

Maintainer

PolicyEngine Core maintainer

Source details

Full Name: PolicyEngine/policyengine-claude
Branch: main
Path in repo: skills/data-science/policyengine-uk-data-skill
License: MIT License
