Agent skill

blueprint

Define reusable Airflow task group templates with Pydantic validation and compose DAGs from YAML. Use when creating blueprint templates, composing DAGs from YAML, validating configurations, or enabling no-code DAG authoring for non-engineers.


Install this agent skill to your Project

npx add-skill https://github.com/astronomer/agents/tree/main/skills/blueprint

SKILL.md

Blueprint Implementation

You are helping a user work with Blueprint, a system for composing Airflow DAGs from YAML using reusable Python templates. Execute steps in order and prefer the simplest configuration that meets the user's needs.

Package: airflow-blueprint on PyPI
Repo: https://github.com/astronomer/blueprint
Requires: Python 3.10+, Airflow 2.5+, Blueprint 0.1.1+

Before Starting

Confirm with the user:

  1. Airflow version ≥2.5
  2. Python version ≥3.10
  3. Use case: Blueprint is for standardized, validated templates. If the user needs full Airflow flexibility, suggest writing DAGs directly or using DAG Factory instead.
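The first two checks can be scripted. The following is a minimal sketch using only the standard library; the helper names `parse_version`, `meets_minimum`, and `airflow_version` are hypothetical and not part of Blueprint, and the version parsing is deliberately naive (it drops pre-release suffixes such as `rc1`).

```python
from __future__ import annotations

import sys
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.9.1' into (2, 9, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:  # e.g. "0rc1": keep the 0, ignore the suffix
            break
    return tuple(parts)


def meets_minimum(installed: str, minimum: tuple[int, ...]) -> bool:
    """Compare a version string against a minimum such as (2, 5)."""
    return parse_version(installed) >= minimum


def airflow_version() -> str | None:
    """Installed apache-airflow version, or None if it is not installed."""
    try:
        return metadata.version("apache-airflow")
    except metadata.PackageNotFoundError:
        return None


python_ok = sys.version_info >= (3, 10)
found = airflow_version()
airflow_ok = found is not None and meets_minimum(found, (2, 5))
print(f"Python >= 3.10: {python_ok}; Airflow >= 2.5: {airflow_ok}")
```

Run this inside the same environment Airflow runs in; a result from a different virtualenv says nothing about the scheduler's interpreter.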

Determine What the User Needs

| User Request | Action |
| --- | --- |
| "Create a blueprint" / "Define a template" | Go to Creating Blueprints |
| "Create a DAG from YAML" / "Compose steps" | Go to Composing DAGs in YAML |
| "Validate my YAML" / "Lint blueprint" | Go to Validation Commands |
| "Set up blueprint in my project" | Go to Project Setup |
| "Version my blueprint" | Go to Versioning |
| "Generate schema" / "Astro IDE setup" | Go to Schema Generation |
| Blueprint errors / troubleshooting | Go to Troubleshooting |

Project Setup

If the user is starting fresh, guide them through setup:

1. Install the Package

bash
# Add to requirements.txt
airflow-blueprint>=0.1.1

# Or install directly
pip install airflow-blueprint

2. Create the Loader

Create dags/loader.py:

python
from blueprint import build_all

build_all(
    dag_defaults={
        "default_args": {"owner": "data-team", "retries": 2},
    }
)

3. Verify Installation

bash
uvx --from airflow-blueprint blueprint list

If no blueprints are found, the user needs to create blueprint classes first.


Creating Blueprints

When the user wants to create a new blueprint template:

Blueprint Structure

python
# dags/templates/my_blueprints.py
from airflow.operators.bash import BashOperator
from airflow.utils.task_group import TaskGroup
from blueprint import Blueprint, BaseModel, Field

class MyConfig(BaseModel):
    # Required field with description (used in CLI output and JSON schema)
    source_table: str = Field(description="Source table name")
    # Optional field with default and validation
    batch_size: int = Field(default=1000, ge=1)

class MyBlueprint(Blueprint[MyConfig]):
    """Docstring becomes blueprint description."""

    def render(self, config: MyConfig) -> TaskGroup:
        with TaskGroup(group_id=self.step_id) as group:
            BashOperator(
                task_id="my_task",
                bash_command=f"echo '{config.source_table}'"
            )
        return group

Key Rules

| Element | Requirement |
| --- | --- |
| Config class | Must inherit from BaseModel |
| Blueprint class | Must inherit from Blueprint[ConfigClass] |
| render() method | Must return TaskGroup or BaseOperator |
| Task IDs | Use self.step_id for the group/task ID |

Recommend Strict Validation

Suggest adding extra="forbid" to catch YAML typos:

python
from pydantic import ConfigDict

class MyConfig(BaseModel):
    model_config = ConfigDict(extra="forbid")
    # fields...
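To see what strict validation catches, the sketch below validates a config with a misspelled key against a model that forbids extras. It imports from Pydantic v2 directly, on the assumption that Blueprint's `BaseModel` and `Field` behave the same way; `StrictConfig`, `validation_errors`, and the misspelled key `batch_sze` are made up for illustration.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class StrictConfig(BaseModel):
    # Reject any key that is not declared as a field below.
    model_config = ConfigDict(extra="forbid")

    source_table: str = Field(description="Source table name")
    batch_size: int = Field(default=1000, ge=1)


def validation_errors(raw: dict) -> list[tuple[str, str]]:
    """Return (field, error type) pairs, or an empty list if raw is valid."""
    try:
        StrictConfig.model_validate(raw)
    except ValidationError as exc:
        return [
            (".".join(str(part) for part in err["loc"]), err["type"])
            for err in exc.errors()
        ]
    return []


# "batch_sze" is a typo for "batch_size". With Pydantic's default
# (extra="ignore") it would be dropped silently and the default 1000 used.
typo_errors = validation_errors({"source_table": "raw.customers", "batch_sze": 500})
print(typo_errors)
```

The `extra_forbidden` error type reported here is the one Pydantic renders as "Extra inputs are not permitted".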

Composing DAGs in YAML

When the user wants to create a DAG from blueprints:

YAML Structure

yaml
# dags/my_pipeline.dag.yaml
dag_id: my_pipeline
schedule: "@daily"
tags: [etl]

steps:
  step_one:
    blueprint: my_blueprint
    source_table: raw.customers
    batch_size: 500

  step_two:
    blueprint: another_blueprint
    depends_on: [step_one]
    target: analytics.output

Reserved Keys in Steps

| Key | Purpose |
| --- | --- |
| blueprint | Template name (required) |
| depends_on | List of upstream step names |
| version | Pin to specific blueprint version |

Everything else passes to the blueprint's config.
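The split between reserved keys and config values can be pictured with a few lines of plain Python. This is only an illustration of the rule above, not Blueprint's actual parsing code; `RESERVED_KEYS` and `split_step` are hypothetical names.

```python
RESERVED_KEYS = {"blueprint", "depends_on", "version"}


def split_step(step: dict) -> tuple[dict, dict]:
    """Separate reserved keys from the values handed to the blueprint config."""
    if "blueprint" not in step:
        raise ValueError("each step needs a 'blueprint' key")
    reserved = {k: v for k, v in step.items() if k in RESERVED_KEYS}
    config = {k: v for k, v in step.items() if k not in RESERVED_KEYS}
    return reserved, config


# step_two from the YAML example above, as it would look once parsed.
step_two = {
    "blueprint": "another_blueprint",
    "depends_on": ["step_one"],
    "target": "analytics.output",
}
reserved, config = split_step(step_two)
print(reserved)
print(config)
```

One practical consequence: a config field cannot be named `blueprint`, `depends_on`, or `version`, because the step would never see it as config.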

Jinja2 Support

YAML values support Jinja2 templating, with access to environment variables (env) and Airflow Variables (var):

yaml
dag_id: "{{ env.get('ENV', 'dev') }}_pipeline"
schedule: "{{ var.value.schedule | default('@daily') }}"

Validation Commands

Run CLI commands with uvx:

bash
uvx --from airflow-blueprint blueprint <command>

| Command | When to Use |
| --- | --- |
| blueprint list | Show available blueprints |
| blueprint describe <name> | Show config schema for a blueprint |
| blueprint describe <name> -v N | Show schema for specific version |
| blueprint lint | Validate all *.dag.yaml files |
| blueprint lint <path> | Validate specific file |
| blueprint schema <name> | Generate JSON schema |
| blueprint new | Interactive DAG YAML creation |

Validation Workflow

bash
# Check all YAML files
blueprint lint

# Expected output for valid files:
# PASS customer_pipeline.dag.yaml (dag_id=customer_pipeline)

Versioning

When the user needs to version blueprints for backwards compatibility:

Version Naming Convention

  • v1: MyBlueprint (no suffix)
  • v2: MyBlueprintV2
  • v3: MyBlueprintV3

python
# v1 - original
class ExtractConfig(BaseModel):
    source_table: str

class Extract(Blueprint[ExtractConfig]):
    def render(self, config): ...

# v2 - breaking changes, new class
class ExtractV2Config(BaseModel):
    sources: list[dict]  # Different schema

class ExtractV2(Blueprint[ExtractV2Config]):
    def render(self, config): ...

Using Versions in YAML

yaml
steps:
  # Pin to v1
  legacy_extract:
    blueprint: extract
    version: 1
    source_table: raw.data

  # Use latest (v2)
  new_extract:
    blueprint: extract
    sources: [{table: orders}]
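The naming convention and the "pinned or latest" rule can be sketched as follows. This is an illustration, not Blueprint's resolver: the helper names are hypothetical, and the mapping from class name to YAML name (`Extract` to `extract`, `MyBlueprint` to `my_blueprint`) is an assumption inferred from the examples in this document.

```python
from __future__ import annotations

import re

# A trailing "V<number>" marks version 2 and later; no suffix means v1.
_VERSION_SUFFIX = re.compile(r"^(?P<base>.+?)V(?P<version>\d+)$")


def to_snake_case(name: str) -> str:
    """MyBlueprint -> my_blueprint."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def parse_class_name(class_name: str) -> tuple[str, int]:
    """Split a class name into (yaml_name, version)."""
    match = _VERSION_SUFFIX.match(class_name)
    if match:
        return to_snake_case(match.group("base")), int(match.group("version"))
    return to_snake_case(class_name), 1


def resolve(class_names: list[str], blueprint: str, version: int | None = None) -> str:
    """Pick the class for a step: the pinned version, or the highest available."""
    candidates = {}
    for class_name in class_names:
        base, found_version = parse_class_name(class_name)
        if base == blueprint:
            candidates[found_version] = class_name
    if not candidates:
        raise KeyError(f"no blueprint named {blueprint!r}")
    chosen = max(candidates) if version is None else version
    if chosen not in candidates:
        raise KeyError(f"{blueprint!r} has no version {chosen}")
    return candidates[chosen]


known = ["Extract", "ExtractV2"]
print(resolve(known, "extract", version=1))  # legacy_extract step
print(resolve(known, "extract"))             # new_extract step
```

Because an unpinned step follows the highest version, adding `ExtractV3` later changes what `new_extract` renders; pin `version` on any step that must not move.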

Schema Generation

Generate JSON schemas for editor autocompletion or external tooling:

bash
# Generate schema for a blueprint
blueprint schema extract > extract.schema.json

Astro Project Auto-Detection

After creating or modifying a blueprint, automatically check if the project is an Astro project by looking for a .astro/ directory (created by astro dev init).
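The directory check is a one-liner with `pathlib`. A minimal sketch, where `is_astro_project` is a hypothetical helper name and the temporary directory only stands in for a real project root:

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def is_astro_project(project_root: str | Path = ".") -> bool:
    """True if project_root contains a .astro/ directory."""
    return (Path(project_root) / ".astro").is_dir()


# Demonstrate on a throwaway directory rather than the current project.
with tempfile.TemporaryDirectory() as tmp:
    before = is_astro_project(tmp)
    (Path(tmp) / ".astro").mkdir()
    after = is_astro_project(tmp)

print(before, after)
```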

If the project is an Astro project, automatically regenerate schemas without prompting:

bash
mkdir -p blueprint/generated-schemas
# For each name from `blueprint list`: blueprint schema NAME > blueprint/generated-schemas/NAME.schema.json

The Astro IDE reads blueprint/generated-schemas/ to render configuration forms. Keeping schemas in sync ensures the visual builder always reflects the latest blueprint configs.
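The regeneration loop can also be driven from Python. The sketch below assumes the `blueprint` CLI is on the PATH and that `blueprint schema NAME` prints the schema to stdout, as the redirect above implies. The blueprint names are passed in by the caller because the output format of `blueprint list` is not specified here; `schema_jobs` and `regenerate` are hypothetical helper names.

```python
import subprocess
from pathlib import Path

SCHEMA_DIR = Path("blueprint/generated-schemas")


def schema_jobs(names, schema_dir=SCHEMA_DIR):
    """Pair each blueprint name with the command to run and the file to write."""
    return [
        (["blueprint", "schema", name], schema_dir / f"{name}.schema.json")
        for name in names
    ]


def regenerate(names, schema_dir=SCHEMA_DIR):
    """Run `blueprint schema NAME` for each name and save its stdout."""
    schema_dir.mkdir(parents=True, exist_ok=True)
    for command, target in schema_jobs(names, schema_dir):
        # check=True raises CalledProcessError if the CLI exits non-zero.
        result = subprocess.run(command, capture_output=True, text=True, check=True)
        target.write_text(result.stdout)


# Building the job list has no side effects, so it is safe to inspect first.
jobs = schema_jobs(["extract", "my_blueprint"])
for command, target in jobs:
    print(" ".join(command), ">", target)
```

Call `regenerate([...])` only after confirming the names with `blueprint list`.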

If you cannot determine whether the project is an Astro project, ask the user once and remember for the rest of the session.


Troubleshooting

"Blueprint not found"

Cause: Blueprint class not in Python path.

Fix: Check template directory or use --template-dir:

bash
blueprint list --template-dir dags/templates/

"Extra inputs are not permitted"

Cause: YAML field name typo with extra="forbid" enabled.

Fix: Run blueprint describe <name> to see valid field names.

DAG not appearing in Airflow

Cause: Missing or broken loader.

Fix: Ensure dags/loader.py exists and calls build_all():

python
from blueprint import build_all
build_all()

"Cyclic dependency detected"

Cause: Circular depends_on references.

Fix: Review step dependencies and remove cycles.
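To locate the cycle rather than hunt for it by eye, the parsed steps can be run through the standard library's `graphlib`. This is a standalone diagnostic sketch, not the check Blueprint itself performs; `execution_order` and `find_cycle` are hypothetical names, and the step dictionaries are assumed to be the parsed `steps:` mapping from a DAG YAML file.

```python
from __future__ import annotations

from graphlib import CycleError, TopologicalSorter


def execution_order(steps: dict[str, dict]) -> list[str]:
    """Order step names so that every step comes after its depends_on entries."""
    graph = {name: set(step.get("depends_on", [])) for name, step in steps.items()}
    return list(TopologicalSorter(graph).static_order())


def find_cycle(steps: dict[str, dict]) -> list[str] | None:
    """Return the step names forming a cycle, or None if there is none."""
    try:
        execution_order(steps)
    except CycleError as exc:
        # graphlib puts the offending nodes in the second exception argument.
        return list(exc.args[1])
    return None


valid = {
    "step_one": {"blueprint": "my_blueprint"},
    "step_two": {"blueprint": "another_blueprint", "depends_on": ["step_one"]},
}
cyclic = {
    "a": {"blueprint": "my_blueprint", "depends_on": ["b"]},
    "b": {"blueprint": "my_blueprint", "depends_on": ["a"]},
}

print(execution_order(valid))
print(find_cycle(cyclic))
```

Note that `TopologicalSorter` silently adds any name that appears only in `depends_on`, so this sketch does not flag references to steps that do not exist.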

Debugging in Airflow UI

Every Blueprint task exposes two extra fields in the Rendered Template view:

  • blueprint_step_config - resolved YAML config
  • blueprint_step_code - Python source of blueprint

Verification Checklist

Before finishing, verify with the user:

  • blueprint list shows their templates
  • blueprint lint passes for all YAML files
  • dags/loader.py exists with build_all()
  • DAG appears in Airflow UI without parse errors

Reference

Astro IDE
