Agent skill
dbt-model-builder
Create dbt models following FF Analytics Kimball patterns and 2×2 stat model. This skill should be used when creating staging models, core facts/dimensions, or analytical marts. Guides through model creation with proper grain, tests, External Parquet configuration, and per-model YAML documentation using dbt 1.10+ syntax.
Install this agent skill to your Project
npx add-skill https://github.com/zazu-22/ff_data_analytics/tree/main/.claude/_archive_skills/dbt-model-builder
SKILL.md
dbt Model Builder
Create complete dbt models for the Fantasy Football Analytics project following Kimball dimensional modeling and the 2×2 stat model (actuals/projections × real-world/fantasy).
When to Use This Skill
Use this skill proactively when:
- Creating new dbt models (staging, core, marts)
- User asks to "create a model for {entity/process}"
- User mentions dbt modeling, dimensional modeling, or Kimball patterns
- Adding analytics-ready views or transformations
- Implementing 2×2 model quadrants (actuals/projections, real-world/fantasy scoring)
dbt Modeling Approach
The FF Analytics project follows:
- Kimball Dimensional Modeling - Facts, dimensions, conformed entities
- 2×2 Stat Model - Separate facts for actuals vs projections
- Per-Model YAML - One
_<model>.ymlfile per model - External Parquet - Large models use external=true with partitioning
- dbt 1.10+ syntax - Test arguments wrapped in
arguments:block
Model Building Workflows
Workflow 1: Create Staging Model
Staging models normalize raw provider data.
Steps:
-
Identify source: Determine provider and dataset
-
Design grain: Define one row per...
-
Create SQL using
assets/staging_template.sql:- Name:
stg_{provider}__{dataset}.sql - Materialize as view
- Select from
{{ source('{provider}', '{dataset}') }} - Rename columns to standard names
- Name:
-
Create YAML using
assets/staging_yaml_template.yml:- Name:
_stg_{provider}__{dataset}.yml - Document grain and source
- Add not_null, unique tests for PKs
- Add accepted_values for enums
- Name:
-
Run and test:
bashmake dbt-run --select stg_{provider}__{dataset} make dbt-test --select stg_{provider}__{dataset}
Workflow 2: Create Fact Table
Fact tables capture measurable events/processes.
Steps:
-
Design grain: Define composite primary key (e.g., player_id + game_id + stat_name)
-
Map foreign keys: Join to conformed dimensions (
dim_player,dim_team, etc.) -
Create SQL using
assets/fact_template.sql:- Name:
fact_{process}.sql - Config:
materialized='table', external=true, partition_by=['season','week'] - Join staging to dimensions for FK resolution
- Select grain keys + measures
- Name:
-
Create YAML using
assets/fact_yaml_template.yml:- Document grain explicitly
- Add
dbt_utils.unique_combination_of_columnsfor grain test - Add relationship tests for all FKs
- Add not_null for required measures
-
Run and test:
bashmake dbt-run --select fact_{process} make dbt-test --select fact_{process}
Critical: Fact tables MUST have grain uniqueness test with dbt 1.10+ syntax:
data_tests:
- dbt_utils.unique_combination_of_columns:
arguments:
combination_of_columns:
- column1
- column2
config:
severity: error
Workflow 3: Create Dimension Table
Dimensions provide descriptive context for facts.
Steps:
- Determine SCD type: Type 1 (replace) or Type 2 (historical tracking)
- Design natural key: Business key for the entity
- Create SQL using
assets/dim_template.sql:- Name:
dim_{entity}.sql - Generate surrogate key with
dbt_utils.generate_surrogate_key() - For SCD Type 2: Add valid_from, valid_to, is_current
- Name:
- Create YAML:
- Document grain: "one row per {entity}"
- Add unique test on surrogate key
- Add not_null on natural key
- Run and test
SCD Type 2 pattern:
- Track changes over time with validity dates
- Include version_number for multiple versions
- Set is_current flag for latest version
Workflow 4: Create Analytical Mart (2×2 Model)
Marts provide wide-format, analytics-ready data.
2×2 Model Quadrants:
mart_real_world_actuals- NFL stats (actuals)mart_real_world_projections- Projected NFL statsmart_fantasy_actuals- Fantasy points (actuals, apply scoring rules)mart_fantasy_projections- Projected fantasy points
Steps:
- Select quadrant: Determine actuals vs projections, real-world vs fantasy
- Pivot fact table: Convert long-form stats to wide columns
- Join dimensions: Enrich with descriptive attributes
- Apply scoring (fantasy quadrants only):
- Join
dim_scoring_rule - Calculate points:
{stat} * {points_per_stat}
- Join
- Create SQL using
assets/mart_template.sql:- Partition by season
- Wide format with one column per stat
- Run and test
Example pivot:
SUM(CASE WHEN stat_name = 'passing_yards' THEN stat_value END) AS passing_yards,
SUM(CASE WHEN stat_name = 'passing_tds' THEN stat_value END) AS passing_tds
Resources Provided
references/
Real models from the codebase:
- example_staging_model.sql - stg_ktc_assets
- example_staging_yaml.yml - YAML with tests
- example_fact_model.sql - Fact table example
- example_dim_model.sql - Dimension example
- example_mart_model.sql - Mart example
assets/
Templates for creating models:
- staging_template.sql - Staging model SQL
- staging_yaml_template.yml - Staging YAML with tests
- fact_template.sql - Fact table SQL with FK joins
- fact_yaml_template.yml - Fact YAML with grain test
- dim_template.sql - Dimension SQL with SCD Type 2
- mart_template.sql - Mart SQL with pivot pattern
Best Practices
Grain Declaration
CRITICAL: Every model must explicitly declare grain:
- In SQL comments:
-- Grain: one row per... - In YAML description
- In grain uniqueness test (facts)
Testing Strategy
Staging models:
- not_null on all PKs
- unique on single-column PKs
- accepted_values on enums
Fact tables:
- dbt_utils.unique_combination_of_columns (grain test)
- relationships to all dimensions
- not_null on FKs and measures
Dimensions:
- unique on surrogate key
- not_null on natural key
dbt 1.10+ Test Syntax
CRITICAL: Follow these two rules to avoid deprecation warnings:
- Use
data_tests:key (nottests:): dbt 1.5+ introduceddata_tests:to distinguish fromunit_tests: - Wrap test parameters in
arguments:: dbt 1.10+ requires this for all generic tests with parameters
# CORRECT - Column-level tests
columns:
- name: position
data_tests: # Use data_tests:, not tests:
- not_null
- accepted_values:
arguments: # Arguments must be nested
values: ['QB', 'RB', 'WR', 'TE']
- name: player_id
data_tests:
- not_null
- relationships:
arguments: # Wrap to, field in arguments:
to: ref('dim_player')
field: player_id
config: # config: is sibling to arguments:
where: "player_id > 0"
# CORRECT - Model-level tests
data_tests: # Use data_tests:, not tests:
- dbt_utils.unique_combination_of_columns:
arguments:
combination_of_columns:
- player_key
- game_id
# WRONG - Deprecated syntax (will cause warnings)
columns:
- name: position
tests: # WRONG - should be data_tests:
- accepted_values:
values: ['QB', 'RB'] # WRONG - should be under arguments:
tests: # WRONG - should be data_tests:
- relationships:
to: ref('dim_player') # WRONG - should be under arguments:
field: player_id
Key Points:
- Always use
data_tests:(nottests:) arguments:wraps test parameters (to, field, values, combination_of_columns)config:is a sibling toarguments:, not nested insidenot_nullanduniquehave no arguments, use directly
External Parquet Configuration
Large models use External Parquet:
{{ config(
materialized='table',
external=true,
partition_by=['season', 'week']
) }}
Naming Conventions
- Staging:
stg_{provider}__{dataset} - Facts:
fact_{process}(e.g.,fact_player_stats) - Dimensions:
dim_{entity}(e.g.,dim_player) - Marts:
mart_{purpose}(e.g.,mart_fantasy_actuals_weekly) - YAML:
_<model_name>.yml
Integration with Other Skills
- data-ingestion-builder - Create staging models after adding providers
- data-quality-test-generator - Enhance testing beyond basics
Recommended Agent Skills
Expand your agent's capabilities with these related and highly-rated skills.
notebook-creator
Create Jupyter notebooks for FF Analytics following project conventions. Use this skill when the user requests analysis notebooks for player evaluation, roster health, trade scenarios, market trends, or projection quality analysis. Guides through notebook structure, DuckDB connections, mart queries, visualization standards, and freshness banner patterns.
adr-creator
Create Architecture Decision Records (ADRs) documenting significant technical decisions for the FF Analytics platform. This skill should be used when making architectural choices, evaluating alternatives for data models or infrastructure, documenting trade-offs, or when the user asks "should we use X or Y approach?" Guides through the ADR creation workflow from context gathering to documentation.
sprint-planner
Design and structure development sprints with atomic, LLM-ready task files. This skill should be used when the user wants to plan a new development sprint, break down complex projects into executable tasks, or create a sprint execution framework. Produces sprint plans, atomic task specifications, and corresponding executor skills for LLM coding agents.
data-quality-test-generator
Generate comprehensive dbt test suites following FF Analytics data quality standards and dbt 1.10+ syntax. This skill should be used when creating tests for new dbt models, adding tests to existing models, standardizing test coverage, or implementing data quality gates. Covers grain uniqueness, FK relationships, enum validation, and freshness tests.
strategic-planner
Design comprehensive technical specifications and strategic plans for data architecture and analytics projects. This skill should be used when planning major features, creating SPEC documents, assessing product requirements, breaking down complex projects into phases, or documenting architectural strategies like SPEC-1. Guides through requirements gathering, MoSCoW prioritization, phase planning, and open items tracking.
sprint-1-executor
Didn't find tool you were looking for?