Back to Skills

dbt-architecture

dbt project structure using medallion architecture (bronze/silver/gold layers). Use this skill when planning project organization, establishing folder structure, defining naming conventions, implementing layer-based configuration, or ensuring proper model dependencies and architectural patterns.

23 stars
6 forks
Updated 1d ago

Install

git clone https://github.com/sfc-gh-dflippo/snowflake-dbt-demo ~/.claude/skills/snowflake-dbt-demo

SKILL.md


name: dbt-architecture description: dbt project structure using medallion architecture (bronze/silver/gold layers). Use this skill when planning project organization, establishing folder structure, defining naming conventions, implementing layer-based configuration, or ensuring proper model dependencies and architectural patterns.

dbt Architecture

Purpose

Transform AI agents into experts on dbt project architecture and medallion layer patterns, providing guidance on structuring production-grade dbt projects with proper layer separation, naming conventions, and configuration strategies.

When to Use This Skill

Activate this skill when users ask about:

  • Planning dbt project structure and folder organization
  • Implementing medallion architecture (bronze/silver/gold)
  • Establishing naming conventions for models and columns
  • Configuring folder-level settings in dbt_project.yml
  • Ensuring proper model dependencies and data flow
  • Understanding layer separation and architectural patterns
  • Setting up tag inheritance strategies

Core Philosophy: Medallion Architecture + Best Practices Integration

Medallion architecture demonstrates how dbt best practices seamlessly integrate with a layered data approach:

  • Bronze Layer = Staging Models (stg_) - One-to-one source relationships
  • Silver Layer = Intermediate Models (int_) - Business logic transformations
  • Gold Layer = Marts (dim_, fct_) - Business-ready data products

Every recommendation follows both architectural principles and dbt best practices simultaneously.


Medallion Architecture Quick Reference

Three Layers

Bronze (Staging):

  • Naming: stg_{source}__{table}
  • Materialization: ephemeral
  • Purpose: One-to-one source cleaning
  • Rules: No joins, no business logic

Silver (Intermediate):

  • Naming: int_{entity}__{description}
  • Materialization: ephemeral or table
  • Purpose: Business logic, enrichment
  • Rules: No direct source references

Gold (Marts):

  • Naming: dim_{entity} or fct_{process}
  • Materialization: table or incremental
  • Purpose: Business-ready data products
  • Rules: Fully tested, documented, optimized

Critical Architectural Rules

Always enforce these patterns:

  1. βœ… No Direct Joins to Source - Models reference staging (ref('stg_*')), never source() directly
  2. βœ… One-to-One Staging - Each source table has exactly ONE staging model
  3. βœ… Proper Layering - Clear flow: staging β†’ intermediate β†’ marts
  4. βœ… Standardized Naming - Consistent stg_, int_, dim_, fct_ prefixes
  5. βœ… Use ref() and source() - No hard-coded table references
  6. βœ… Folder-Level Configuration - Set common settings in dbt_project.yml

Official dbt Documentation: How we structure our dbt projects


Bronze Layer: Staging Models

Purpose: One-to-one relationship with source tables. Light cleaning and standardization only.

Materialization: ephemeral (compiled as CTEs)

Naming: stg_{source}__{table}.sql

Template

-- models/bronze/stg_tpc_h__customers.sql
{{ config(materialized='ephemeral') }}

select
    -- Primary key (renamed)
    c_custkey as customer_id,

    -- Attributes (cast and renamed)
    c_name as customer_name,
    c_address as customer_address,
    c_phone as phone_number,
    c_acctbal as account_balance,

    -- Metadata
    current_timestamp() as dbt_loaded_at

from {{ source('tpc_h', 'customer') }}

Rules

βœ… DO:

  • One source table β†’ One staging model
  • Reference sources using {{ source() }}
  • Rename columns to standard naming
  • Cast data types
  • Basic cleaning (trim, upper/lower)

❌ DON'T:

  • Join between sources
  • Add business logic
  • Aggregate data
  • Hard-code table names

Silver Layer: Intermediate Models

Purpose: Reusable business logic and complex transformations. Sits between staging and marts.

Materialization: ephemeral (reusable logic) or table (complex computations)

Naming: int_{entity}__{description}.sql

Template

-- models/silver/int_customers__with_orders.sql
{{ config(materialized='ephemeral') }}

with customers as (
    select * from {{ ref('stg_tpc_h__customers') }}
),

orders as (
    select * from {{ ref('stg_tpc_h__orders') }}
),

customer_metrics as (
    select
        customer_id,
        count(*) as total_orders,
        sum(order_total) as lifetime_value,
        min(order_date) as first_order_date
    from orders
    group by customer_id
)

select
    c.customer_id,
    c.customer_name,
    coalesce(m.total_orders, 0) as total_orders,
    coalesce(m.lifetime_value, 0) as lifetime_value,
    m.first_order_date
from customers c
left join customer_metrics m on c.customer_id = m.customer_id

Rules

βœ… DO:

  • Reference staging + other intermediate models
  • Add business logic and aggregations
  • Create reusable components
  • Use CTEs for clarity

❌ DON'T:

  • Reference sources directly
  • Add final presentation logic
  • Create one-time-use models

Gold Layer: Marts Models

Purpose: Business-ready data products optimized for BI tools and end users.

Materialization: table (dimensions) or incremental (large facts)

Naming: dim_{entity} (dimensions), fct_{process} (facts)

Dimension Template

-- models/gold/dim_customers.sql
{{ config(materialized='table') }}

with customers as (
    select * from {{ ref('int_customers__with_orders') }}
)

select
    -- Primary key
    customer_id,

    -- Attributes
    customer_name,
    customer_email,

    -- Metrics
    total_orders,
    lifetime_value,
    first_order_date,

    -- Business classification
    case
        when lifetime_value >= 5000 then 'gold'
        when lifetime_value >= 1000 then 'silver'
        else 'bronze'
    end as customer_tier,

    -- Metadata
    current_timestamp() as dbt_updated_at
from customers

Fact Template

-- models/gold/fct_orders.sql
{{ config(
    materialized='incremental',
    unique_key='order_id',
    cluster_by=['order_date', 'customer_id']
) }}

select
    order_id,
    customer_id,
    order_date,
    order_status,
    order_total,
    current_timestamp() as dbt_updated_at
from {{ ref('stg_tpc_h__orders') }}

{% if is_incremental() %}
    where order_date > (select max(order_date) from {{ this }})
{% endif %}

Rules

βœ… DO:

  • Reference staging, intermediate, and other marts
  • Add final business logic
  • Optimize for query performance (clustering)
  • Test comprehensively
  • Document for business users

❌ DON'T:

  • Reference sources directly
  • Create unnecessary complexity

Naming Conventions

Model Naming

LayerPrefixExamplePurpose
Bronze/Stagingstg_stg_tpc_h__customersClean source data
Silver/Intermediateint_int_customers__with_ordersBusiness logic
Gold/Dimensionsdim_dim_customersBusiness entities
Gold/Factsfct_fct_ordersBusiness events

Column Naming Standards

Primary & Foreign Keys:

  • {entity}_id - customer_id, order_id, product_id
  • Foreign keys use same naming as primary key in related table

Boolean Flags:

  • is_{condition} - is_active, is_deleted, is_first_order
  • has_{attribute} - has_orders, has_discount

Dates & Timestamps:

  • {event}_date - order_date, created_date
  • {event}_at - created_at, updated_at, deleted_at
  • Always use UTC timezone suffix if needed - created_at_utc

Metrics & Aggregates:

  • {metric}_count - order_count, customer_count
  • {metric}_amount - total_amount, discount_amount
  • Include currency suffix if applicable - amount_usd, price_eur

Row Numbers & Sequences:

  • {entity}_row_number - order_row_number
  • {entity}_seq_number - sequence_number

Consistency Rules

βœ… DO:

  • Use snake_case for all column names
  • Use consistent entity names across models
  • Include currency/units in column names when relevant
  • Keep names concise but descriptive

❌ DON'T:

  • Mix naming styles (camelCase vs snake_case)
  • Use abbreviations inconsistently
  • Create ambiguous names without context
  • Use reserved SQL keywords

Folder Structure

models/
β”œβ”€β”€ bronze/          # Staging layer - one-to-one with sources
β”‚   β”œβ”€β”€ stg_tpc_h__customers.sql
β”‚   β”œβ”€β”€ stg_tpc_h__orders.sql
β”‚   └── stg_tpc_h__lineitem.sql
β”œβ”€β”€ silver/         # Intermediate layer - business logic
β”‚   β”œβ”€β”€ int_customers__with_orders.sql
β”‚   β”œβ”€β”€ int_fx_rates__daily.sql
β”‚   └── customer_segments.sql
└── gold/           # Marts layer - business-ready analytics
    β”œβ”€β”€ dim_customers.sql
    β”œβ”€β”€ dim_products.sql
    β”œβ”€β”€ fct_orders.sql
    └── fct_order_lines.sql

Configuration in dbt_project.yml

Folder-Level Configuration (Reduces Repetition)

Configure common settings at the folder level to minimize model-level overrides:

models:
  your_project:
    bronze:
      +materialized: ephemeral
      +tags: ["bronze", "staging"]
      +schema: bronze

    silver:
      +materialized: ephemeral
      +tags: ["silver"]
      +schema: silver

    gold:
      +materialized: table
      +tags: ["gold", "marts"]
      +schema: gold

Model-Level Configuration: Override folder defaults only for unique requirements (incremental settings, clustering, etc.)


Tag Inheritance Strategy

βœ… LEVERAGE: dbt's additive tag inheritance

Tags accumulate hierarchically per the dbt documentation. Child folders inherit all parent tags automatically.

# βœ… GOOD: Avoid duplicate tags
bronze:
  +tags: ["bronze", "staging"]
  subfolder:
    +tags: ["subfolder"]  # Inherits: bronze, staging, subfolder

# ❌ BAD: Redundant parent tags
bronze:
  +tags: ["bronze", "staging"]
  subfolder:
    +tags: ["bronze", "staging", "subfolder"]  # Duplicates parent tags

Common Selection Patterns:

dbt run --select tag:bronze     # All bronze models
dbt run --select tag:gold       # All gold models
dbt run --select tag:staging    # Alternative to bronze

Helping Users with Architecture

Strategy for Assisting Users

When users ask for architectural guidance:

  1. Identify the layer: Which medallion layer (bronze/silver/gold)?
  2. Clarify purpose: What transformation or business logic is needed?
  3. Apply naming conventions: Follow stg_, int_, dim_, fct_ patterns
  4. Recommend materialization: Based on layer and reusability
  5. Provide working examples: Show complete, tested code patterns
  6. Validate dependencies: Ensure proper layer flow (staging β†’ intermediate β†’ marts)

Common User Questions

"How should I structure my project?"

  • Explain medallion architecture layers
  • Show folder organization by layer
  • Demonstrate model dependencies flow
  • Provide naming convention standards
  • Show configuration strategy (folder-level first)

"Where does this model belong?"

  • Ask: Is it cleaning source data? β†’ Bronze
  • Ask: Does it add business logic? β†’ Silver
  • Ask: Is it for end-user consumption? β†’ Gold

"What should I name this model?"

  • Bronze: stg_{source}__{table}
  • Silver: int_{entity}__{description}
  • Gold dimensions: dim_{entity}
  • Gold facts: fct_{process}

Related Official Documentation


Goal: Transform AI agents into expert dbt architects who guide users through project structure with confidence, clarity, and production-ready patterns.