Start from the business event, not the table list
The fastest way to derail a star schema is to begin by mirroring source systems. In dbt, start with the business event you want to measure instead: orders placed, invoices paid, sessions started, claims submitted.
That event becomes the fact table candidate. Once the event is clear, the rest of the model becomes easier to judge: which dimensions describe the event, which attributes belong in the fact, and which transformations belong upstream in staging or intermediate models.
Mental Model
Think of the fact table as a business ledger and each dimension as a lookup table. Every row in the ledger records a transaction. Every dimension row enriches that transaction with context — who, what, where, when.
- Write the fact grain in one sentence before you create any models.
- List the measures the business expects from that grain.
- Separate descriptive attributes from additive metrics early.
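The three steps above can be sketched directly as a dbt fact model. This is a minimal illustration, not a finished design — the `stg_orders` ref and the order-line grain are assumptions for the sketch:

```sql
-- Grain: one row per product per order (order line).
-- Measures: quantity, unit_price, total_amount. Everything else is context.
select
    o.order_id,
    o.customer_id,   -- who
    o.product_id,    -- what
    o.store_id,      -- where
    o.order_date,    -- when
    o.quantity,      -- additive measure
    o.unit_price,
    o.total_amount   -- additive measure
from {{ ref('stg_orders') }} as o
```

Writing the grain comment at the top of the file keeps the one-sentence grain statement next to the code it governs.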
Use staging models to normalize warehouse noise
Raw tables usually contain naming drift, duplicated status logic, mixed timestamp semantics, and denormalized blobs that make dimensional modeling harder than it needs to be. dbt staging models are where you clean that noise.
Rename columns into consistent, analyst-friendly terms, standardize types, and push source-specific cleanup into the staging layer. Your dimensional models should read like analytics products, not ingestion artifacts.
-- Staging: rename, retype, clean. No business logic here.
with source as (
    select * from {{ source('raw', 'orders') }}
),

renamed as (
    select
        order_id,
        customer_id,
        product_id,
        store_id,
        cast(order_date as date) as order_date,
        cast(quantity as integer) as quantity,
        cast(unit_price as numeric(10,2)) as unit_price,
        quantity * unit_price as total_amount,
        lower(trim(status)) as order_status
    from source
    where order_id is not null
)

select * from renamed
- Keep staging close to the source — avoid business metrics there.
- Normalize keys and timestamps before dimensional modeling starts.
- Document field meaning in the same repo where models are reviewed.
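In dbt, that documentation lives in a properties file next to the staging model, so field meanings are reviewed in the same pull request as the code. The descriptions below are illustrative; only `not_null` is asserted on `order_id` because the staging grain may be one row per order line:

```yaml
version: 2

models:
  - name: stg_orders
    description: "Raw order lines, cleaned and retyped. No business logic."
    columns:
      - name: order_id
        description: "Natural key from the source system."
        tests:
          - not_null
      - name: order_status
        description: "Lowercased, trimmed source status; not yet mapped to business states."
```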
Design conformed dimensions before you optimize marts
A star schema scales better when dimensions are reusable across facts. Customer, product, account, and date are common examples. If each mart invents its own version, reviews get noisy and definitions drift across the warehouse.
In dbt, conformed dimensions also reduce duplicate logic. Shared dimensions feed multiple marts while preserving a single, reviewable source of truth for descriptive attributes.
- Prefer shared dimensions when the entity definition is stable across domains.
- Keep slowly changing history intentional, not accidental.
- Use surrogate keys only when the modeling problem genuinely requires them.
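A conformed dimension in dbt is just one model that every mart refs. The sketch below assumes the `dbt_utils` package and a `stg_customers` staging model, and uses a surrogate key only because this hypothetical `customer_id` is reused across source systems — if your natural key is stable, skip the macro:

```sql
-- dim_customer: conformed dimension, consumed by every mart
-- that needs customer context.
with customers as (
    select * from {{ ref('stg_customers') }}
)

select
    -- Surrogate key because customer_id repeats across source systems
    -- (an assumption for this sketch, per the guidance above).
    {{ dbt_utils.generate_surrogate_key(['source_system', 'customer_id']) }} as customer_key,
    customer_id,
    full_name,
    email,
    region,
    segment
from customers
```

Because every fact joins through the same model, a definition change lands in one reviewable diff instead of drifting across marts.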
Express your schema in SchemaStruct DSL before writing SQL
Before writing any dbt SQL, expressing your star schema in a structured DSL lets the whole team review the design as a diagram. You can catch grain mismatches, missing foreign keys, and naming inconsistencies at design time instead of during code review.
SchemaStruct parses the DSL into a live diagram. Once the design is agreed, the DSL exports directly to SQL DDL for any target dialect — Snowflake, PostgreSQL, or Databricks.
table dim_customer {
  customer_id int [pk]
  full_name varchar
  email varchar [unique]
  region varchar
  segment varchar
  note: "Conformed — shared across fct_orders and fct_sessions"
}

table dim_product {
  product_id int [pk]
  product_name varchar
  category varchar
  sku varchar [unique]
  unit_cost decimal
}

table dim_date {
  date_id int [pk]
  full_date date
  year int
  quarter int
  month int
  day_of_week varchar
  is_weekend boolean
}

table fct_orders {
  order_id int [pk]
  customer_id int [ref: > dim_customer.customer_id]
  product_id int [ref: > dim_product.product_id]
  date_id int [ref: > dim_date.date_id]
  quantity int
  unit_price decimal
  total_amount decimal
}
Review the model before you ship SQL
Teams often treat dbt as the first review surface, but that forces reviewers to reason through the implementation before agreeing on the structure. A better approach is to review the schema itself first: grain, joins, dimensions, and delivery outputs.
That is where a visual review layer compounds. Once the star schema is agreed, the dbt implementation becomes a handoff step instead of the place where design ambiguity is discovered.
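Once the grain is agreed, it can also be enforced rather than just documented. One way, assuming the `dbt_utils` package and an order-line fact model (names illustrative), is a uniqueness test over the grain columns:

```yaml
version: 2

models:
  - name: fct_orders
    description: "Grain: one row per product per order."
    tests:
      # Fails if two rows share the same order + product,
      # i.e. the agreed grain has silently changed.
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - order_id
            - product_id
```

A failing grain test surfaces the design drift in CI, before a reviewer has to notice it by reading SQL.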
Pro Tip
Ship the schema review, not just the SQL review. Agreeing on grain and joins before writing dbt models cuts downstream rework by catching design issues at the cheapest possible moment — before any SQL exists.