Agent skill

dbt-model-builder

Create dbt models following FF Analytics Kimball patterns and 2×2 stat model. This skill should be used when creating staging models, core facts/dimensions, or analytical marts. Guides through model creation with proper grain, tests, External Parquet configuration, and per-model YAML documentation using dbt 1.10+ syntax.

Stars 0
Forks 0

Install this agent skill to your Project

npx add-skill https://github.com/zazu-22/ff_data_analytics/tree/main/.claude/_archive_skills/dbt-model-builder

SKILL.md

dbt Model Builder

Create complete dbt models for the Fantasy Football Analytics project following Kimball dimensional modeling and the 2×2 stat model (actuals/projections × real-world/fantasy).

When to Use This Skill

Use this skill proactively when:

  • Creating new dbt models (staging, core, marts)
  • User asks to "create a model for {entity/process}"
  • User mentions dbt modeling, dimensional modeling, or Kimball patterns
  • Adding analytics-ready views or transformations
  • Implementing 2×2 model quadrants (actuals/projections, real-world/fantasy scoring)

dbt Modeling Approach

The FF Analytics project follows:

  1. Kimball Dimensional Modeling - Facts, dimensions, conformed entities
  2. 2×2 Stat Model - Separate facts for actuals vs projections
  3. Per-Model YAML - One _<model>.yml file per model
  4. External Parquet - Large models use external=true with partitioning
  5. dbt 1.10+ syntax - Test arguments wrapped in arguments: block

Model Building Workflows

Workflow 1: Create Staging Model

Staging models normalize raw provider data.

Steps:

  1. Identify source: Determine provider and dataset

  2. Design grain: Define one row per...

  3. Create SQL using assets/staging_template.sql:

    • Name: stg_{provider}__{dataset}.sql
    • Materialize as view
    • Select from {{ source('{provider}', '{dataset}') }}
    • Rename columns to standard names
  4. Create YAML using assets/staging_yaml_template.yml:

    • Name: _stg_{provider}__{dataset}.yml
    • Document grain and source
    • Add not_null, unique tests for PKs
    • Add accepted_values for enums
  5. Run and test:

    bash
    make dbt-run --select stg_{provider}__{dataset}
    make dbt-test --select stg_{provider}__{dataset}
    

Workflow 2: Create Fact Table

Fact tables capture measurable events/processes.

Steps:

  1. Design grain: Define composite primary key (e.g., player_id + game_id + stat_name)

  2. Map foreign keys: Join to conformed dimensions (dim_player, dim_team, etc.)

  3. Create SQL using assets/fact_template.sql:

    • Name: fact_{process}.sql
    • Config: materialized='table', external=true, partition_by=['season','week']
    • Join staging to dimensions for FK resolution
    • Select grain keys + measures
  4. Create YAML using assets/fact_yaml_template.yml:

    • Document grain explicitly
    • Add dbt_utils.unique_combination_of_columns for grain test
    • Add relationship tests for all FKs
    • Add not_null for required measures
  5. Run and test:

    bash
    make dbt-run --select fact_{process}
    make dbt-test --select fact_{process}
    

Critical: Fact tables MUST have grain uniqueness test with dbt 1.10+ syntax:

yaml
data_tests:
  - dbt_utils.unique_combination_of_columns:
      arguments:
        combination_of_columns:
          - column1
          - column2
      config:
        severity: error

Workflow 3: Create Dimension Table

Dimensions provide descriptive context for facts.

Steps:

  1. Determine SCD type: Type 1 (replace) or Type 2 (historical tracking)
  2. Design natural key: Business key for the entity
  3. Create SQL using assets/dim_template.sql:
    • Name: dim_{entity}.sql
    • Generate surrogate key with dbt_utils.generate_surrogate_key()
    • For SCD Type 2: Add valid_from, valid_to, is_current
  4. Create YAML:
    • Document grain: "one row per {entity}"
    • Add unique test on surrogate key
    • Add not_null on natural key
  5. Run and test

SCD Type 2 pattern:

  • Track changes over time with validity dates
  • Include version_number for multiple versions
  • Set is_current flag for latest version

Workflow 4: Create Analytical Mart (2×2 Model)

Marts provide wide-format, analytics-ready data.

2×2 Model Quadrants:

  • mart_real_world_actuals - NFL stats (actuals)
  • mart_real_world_projections - Projected NFL stats
  • mart_fantasy_actuals - Fantasy points (actuals, apply scoring rules)
  • mart_fantasy_projections - Projected fantasy points

Steps:

  1. Select quadrant: Determine actuals vs projections, real-world vs fantasy
  2. Pivot fact table: Convert long-form stats to wide columns
  3. Join dimensions: Enrich with descriptive attributes
  4. Apply scoring (fantasy quadrants only):
    • Join dim_scoring_rule
    • Calculate points: {stat} * {points_per_stat}
  5. Create SQL using assets/mart_template.sql:
    • Partition by season
    • Wide format with one column per stat
  6. Run and test

Example pivot:

sql
SUM(CASE WHEN stat_name = 'passing_yards' THEN stat_value END) AS passing_yards,
SUM(CASE WHEN stat_name = 'passing_tds' THEN stat_value END) AS passing_tds

Resources Provided

references/

Real models from the codebase:

  • example_staging_model.sql - stg_ktc_assets
  • example_staging_yaml.yml - YAML with tests
  • example_fact_model.sql - Fact table example
  • example_dim_model.sql - Dimension example
  • example_mart_model.sql - Mart example

assets/

Templates for creating models:

  • staging_template.sql - Staging model SQL
  • staging_yaml_template.yml - Staging YAML with tests
  • fact_template.sql - Fact table SQL with FK joins
  • fact_yaml_template.yml - Fact YAML with grain test
  • dim_template.sql - Dimension SQL with SCD Type 2
  • mart_template.sql - Mart SQL with pivot pattern

Best Practices

Grain Declaration

CRITICAL: Every model must explicitly declare grain:

  • In SQL comments: -- Grain: one row per...
  • In YAML description
  • In grain uniqueness test (facts)

Testing Strategy

Staging models:

  • not_null on all PKs
  • unique on single-column PKs
  • accepted_values on enums

Fact tables:

  • dbt_utils.unique_combination_of_columns (grain test)
  • relationships to all dimensions
  • not_null on FKs and measures

Dimensions:

  • unique on surrogate key
  • not_null on natural key

dbt 1.10+ Test Syntax

CRITICAL: Follow these two rules to avoid deprecation warnings:

  1. Use data_tests: key (not tests:): dbt 1.5+ introduced data_tests: to distinguish from unit_tests:
  2. Wrap test parameters in arguments:: dbt 1.10+ requires this for all generic tests with parameters
yaml
# CORRECT - Column-level tests
columns:
  - name: position
    data_tests:  # Use data_tests:, not tests:
      - not_null
      - accepted_values:
          arguments:  # Arguments must be nested
            values: ['QB', 'RB', 'WR', 'TE']

  - name: player_id
    data_tests:
      - not_null
      - relationships:
          arguments:  # Wrap to, field in arguments:
            to: ref('dim_player')
            field: player_id
          config:  # config: is sibling to arguments:
            where: "player_id > 0"

# CORRECT - Model-level tests
data_tests:  # Use data_tests:, not tests:
  - dbt_utils.unique_combination_of_columns:
      arguments:
        combination_of_columns:
          - player_key
          - game_id

# WRONG - Deprecated syntax (will cause warnings)
columns:
  - name: position
    tests:  # WRONG - should be data_tests:
      - accepted_values:
          values: ['QB', 'RB']  # WRONG - should be under arguments:

tests:  # WRONG - should be data_tests:
  - relationships:
      to: ref('dim_player')  # WRONG - should be under arguments:
      field: player_id

Key Points:

  • Always use data_tests: (not tests:)
  • arguments: wraps test parameters (to, field, values, combination_of_columns)
  • config: is a sibling to arguments:, not nested inside
  • not_null and unique have no arguments, use directly

External Parquet Configuration

Large models use External Parquet:

sql
{{ config(
    materialized='table',
    external=true,
    partition_by=['season', 'week']
) }}

Naming Conventions

  • Staging: stg_{provider}__{dataset}
  • Facts: fact_{process} (e.g., fact_player_stats)
  • Dimensions: dim_{entity} (e.g., dim_player)
  • Marts: mart_{purpose} (e.g., mart_fantasy_actuals_weekly)
  • YAML: _<model_name>.yml

Integration with Other Skills

  • data-ingestion-builder - Create staging models after adding providers
  • data-quality-test-generator - Enhance testing beyond basics

Expand your agent's capabilities with these related and highly-rated skills.

zazu-22/ff_data_analytics

notebook-creator

Create Jupyter notebooks for FF Analytics following project conventions. Use this skill when the user requests analysis notebooks for player evaluation, roster health, trade scenarios, market trends, or projection quality analysis. Guides through notebook structure, DuckDB connections, mart queries, visualization standards, and freshness banner patterns.

0 0
Explore
zazu-22/ff_data_analytics

adr-creator

Create Architecture Decision Records (ADRs) documenting significant technical decisions for the FF Analytics platform. This skill should be used when making architectural choices, evaluating alternatives for data models or infrastructure, documenting trade-offs, or when the user asks "should we use X or Y approach?" Guides through the ADR creation workflow from context gathering to documentation.

0 0
Explore
zazu-22/ff_data_analytics

sprint-planner

Design and structure development sprints with atomic, LLM-ready task files. This skill should be used when the user wants to plan a new development sprint, break down complex projects into executable tasks, or create a sprint execution framework. Produces sprint plans, atomic task specifications, and corresponding executor skills for LLM coding agents.

0 0
Explore
zazu-22/ff_data_analytics

data-quality-test-generator

Generate comprehensive dbt test suites following FF Analytics data quality standards and dbt 1.10+ syntax. This skill should be used when creating tests for new dbt models, adding tests to existing models, standardizing test coverage, or implementing data quality gates. Covers grain uniqueness, FK relationships, enum validation, and freshness tests.

0 0
Explore
zazu-22/ff_data_analytics

strategic-planner

Design comprehensive technical specifications and strategic plans for data architecture and analytics projects. This skill should be used when planning major features, creating SPEC documents, assessing product requirements, breaking down complex projects into phases, or documenting architectural strategies like SPEC-1. Guides through requirements gathering, MoSCoW prioritization, phase planning, and open items tracking.

0 0
Explore
zazu-22/ff_data_analytics

sprint-1-executor

0 0
Explore

Didn't find tool you were looking for?

Be as detailed as possible for better results