Data Warehouse Design: Architecture, Schemas, and Where to Actually Start

A data warehouse is a centralized repository that stores large volumes of structured, historical data from multiple sources. Unlike a standard database built for daily transactions, a data warehouse is optimized specifically for analysis and business intelligence. By separating analytical workloads from operational ones, companies can run complex queries across massive datasets to identify long-term trends without slowing down their primary applications.

The design decisions you make early – schema type, ETL strategy, layer architecture – will determine whether your warehouse stays performant and maintainable at scale or becomes a source of technical debt within two years.

Data Warehouse Architecture: The Three Layers

| Layer | Name | Purpose | Tools |
|---|---|---|---|
| Source layer | Raw / Staging | Raw data ingested from source systems as-is | Fivetran, Airbyte, Stitch |
| Storage layer | Data Warehouse Core | Cleaned, modeled, organized data | Snowflake, BigQuery, Redshift |
| Presentation layer | Data Marts / Reports | Subject-specific views for end users | dbt, Looker, Tableau |

A well-designed warehouse keeps these layers clearly separated. The raw layer preserves original data (critical for debugging and reprocessing). The core layer applies business logic. The presentation layer delivers curated views to business users.
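The layer separation can be sketched in a few lines of Python, using an in-memory SQLite database as a stand-in for the warehouse. All table and column names here are illustrative: raw data lands untouched, the core layer applies typing and cleaning rules, and the presentation layer exposes a curated view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Source layer: land raw data exactly as received (everything as text).
conn.execute("CREATE TABLE raw_orders (order_id TEXT, amount TEXT, ordered_at TEXT)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [("1", "19.99", "2024-01-05"), ("2", "5.00", "2024-01-05"), ("3", "bad", "2024-01-06")],
)

# Storage layer: apply typing and cleaning rules; filter unparseable rows.
conn.execute("""
    CREATE TABLE core_orders AS
    SELECT CAST(order_id AS INTEGER) AS order_id,
           CAST(amount AS REAL)      AS amount,
           DATE(ordered_at)          AS order_date
    FROM raw_orders
    WHERE amount GLOB '[0-9]*'
""")

# Presentation layer: a curated view for business users.
conn.execute("""
    CREATE VIEW daily_revenue AS
    SELECT order_date, SUM(amount) AS revenue
    FROM core_orders
    GROUP BY order_date
""")

rows = conn.execute("SELECT * FROM daily_revenue ORDER BY order_date").fetchall()
```

Because `raw_orders` is preserved as-is, the cleaning rule in the core layer can be changed later and the whole pipeline replayed without going back to the source systems.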

Schema Design: Star vs Snowflake vs Data Vault

This is where most warehouse design conversations begin – and where many teams get stuck.

| Schema | Structure | Pros | Cons | Best for |
|---|---|---|---|---|
| Star schema | Fact table + denormalized dimensions | Simple queries; fast aggregations; easy for BI tools | Data redundancy in dimensions | Most analytics workloads |
| Snowflake schema | Fact table + normalized dimensions | Less redundancy; consistent hierarchies | More joins; harder for non-technical users | Complex hierarchies, strict normalization |
| Data Vault | Hubs, Links, Satellites | Highly auditable; handles schema changes well | Complex; steep learning curve | Enterprise DWH; regulatory compliance |

The practical recommendation: start with a star schema. The query simplicity and BI tool compatibility outweigh snowflake's normalization benefits for most teams. Move to Data Vault only if audit requirements or extremely complex historical tracking demand it.

The Fact Table: Heart of the Star Schema

The fact table stores measurable business events – sales transactions, website clicks, inventory movements. Each row represents one event.

What goes in a fact table:

  • Foreign keys to dimension tables
  • Numeric measures (revenue, quantity, duration)
  • Date keys (joining to date dimension)

What doesn’t belong:

  • Descriptive attributes (those go in dimensions)
  • Text fields (bad for aggregation)
  • Calculated fields that can be derived at query time
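The split above can be made concrete with a small SQLite sketch (table and column names are hypothetical): foreign keys and numeric measures live in the fact table, descriptive text lives in the dimension, and a derivable value like revenue is computed at query time rather than stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Dimension: descriptive attributes only.
conn.execute("CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT)")

# Fact: foreign keys + numeric measures, one row per sale event.
conn.execute("""
    CREATE TABLE fact_sales (
        date_key    INTEGER,   -- FK to the date dimension
        product_key INTEGER,   -- FK to dim_product
        quantity    INTEGER,   -- numeric measure
        unit_price  REAL       -- numeric measure
        -- no stored 'total' column: revenue is derived at query time
    )
""")

conn.execute("INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware')")
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(20240105, 1, 3, 10.0), (20240106, 1, 2, 10.0)])

# Derived measure computed at query time, sliced by a dimension attribute.
revenue = conn.execute("""
    SELECT p.category, SUM(f.quantity * f.unit_price) AS revenue
    FROM fact_sales f
    JOIN dim_product p USING (product_key)
    GROUP BY p.category
""").fetchone()
```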

Dimension Tables: The Context Around Facts

Dimension tables provide the “who, what, where, when” context for your facts.

| Dimension | What it describes | Example columns |
|---|---|---|
| Date dimension | Calendar hierarchy | Date, day, month, quarter, year, fiscal period |
| Customer dimension | Customer attributes | Name, segment, region, join date |
| Product dimension | Product attributes | Name, category, SKU, price |
| Geography dimension | Location hierarchy | City, state, country, region |
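A date dimension is usually pre-built rather than derived on the fly. A minimal generator in plain Python might look like this; the column choices mirror the table above, and the `YYYYMMDD` surrogate key is one common convention, not the only one.

```python
from datetime import date, timedelta

def build_date_dimension(start: date, end: date) -> list[dict]:
    """Generate one row per calendar day between start and end, inclusive."""
    rows = []
    d = start
    while d <= end:
        rows.append({
            "date_key": int(d.strftime("%Y%m%d")),   # surrogate key, e.g. 20240101
            "date": d.isoformat(),
            "day": d.day,
            "month": d.month,
            "quarter": (d.month - 1) // 3 + 1,
            "year": d.year,
        })
        d += timedelta(days=1)
    return rows

# Pre-build a full year (2024 is a leap year, so 366 rows).
dim_date = build_date_dimension(date(2024, 1, 1), date(2024, 12, 31))
```

In practice you would generate several years at once and load the result into the warehouse as a regular dimension table.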

Critical concept: Slowly Changing Dimensions (SCD)

What happens when a customer moves to a different state or a product changes its category? How you handle this determines whether your historical data is accurate:

| SCD type | How it works | When to use |
|---|---|---|
| Type 1 | Overwrite old value | History doesn’t matter |
| Type 2 | Add new row with effective dates | Need full historical accuracy |
| Type 3 | Add new column for current/previous | Only current and one prior state needed |

Type 2 is the most common – it preserves history by adding a new dimension row with start/end dates whenever an attribute changes.
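The Type 2 mechanics can be sketched as follows, with a dimension held as a plain list of dicts and illustrative field names (`start_date`/`end_date`, with `end_date = None` marking the current row):

```python
from datetime import date

def apply_scd2(dim_rows: list[dict], customer_id: int, new_state: str, as_of: date) -> list[dict]:
    """Close the current row for this customer and append a new version."""
    current = next(r for r in dim_rows
                   if r["customer_id"] == customer_id and r["end_date"] is None)
    if current["state"] == new_state:
        return dim_rows                        # attribute unchanged: nothing to do
    current["end_date"] = as_of                # close the old version
    dim_rows.append({                          # open the new version
        "customer_id": customer_id,
        "state": new_state,
        "start_date": as_of,
        "end_date": None,                      # None marks the current row
    })
    return dim_rows

# Customer 42 moves from CA to TX: history is preserved as two rows.
dim_customer = [{"customer_id": 42, "state": "CA",
                 "start_date": date(2023, 1, 1), "end_date": None}]
apply_scd2(dim_customer, 42, "TX", date(2024, 6, 1))
```

Facts dated before June 2024 join to the CA row via the effective dates; later facts join to the TX row, which is what keeps historical reports accurate.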

Modern Data Warehouse Stack (2024-2025)

| Component | Leading tools |
|---|---|
| Data ingestion (ELT) | Fivetran, Airbyte, Stitch |
| Storage and compute | Snowflake, BigQuery, Databricks, Redshift |
| Transformation layer | dbt (industry standard for transformations) |
| Orchestration | Airflow, Prefect, Dagster |
| BI / visualization | Tableau, Looker, Power BI, Metabase |
| Data catalog | dbt docs, Atlan, Alation |

The modern shift is from ETL (transform before loading) to ELT (load raw data, then transform in the warehouse). Cloud warehouses now have the compute power to transform at scale, so transformation logic moves into dbt models rather than pre-warehouse pipelines.

Common Data Warehouse Design Mistakes

| Mistake | What happens | Fix |
|---|---|---|
| Skipping the raw/staging layer | No source data to reprocess when logic changes | Always land raw data first |
| One giant fact table | Unmanageable; mixed granularity | One fact table per business process |
| No date dimension | Date-based queries become painful | Pre-build a date dimension spanning years |
| Business logic in the BI layer | Reports become inconsistent across tools | Define metrics in the warehouse / dbt |
| Ignoring grain definition | Queries return wrong aggregations | Define exactly what one row represents |
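The grain mistake in particular is easy to demonstrate. In this sketch (illustrative names, SQLite as a stand-in), a fact table at order grain is joined to order lines: the join fans out, repeating each order total once per line, and the sum is silently wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_orders (order_id INTEGER, order_total REAL)")   # grain: one row per order
conn.execute("CREATE TABLE order_lines (order_id INTEGER, line_no INTEGER)")    # grain: one row per line
conn.execute("INSERT INTO fact_orders VALUES (1, 100.0)")
conn.executemany("INSERT INTO order_lines VALUES (?, ?)", [(1, 1), (1, 2), (1, 3)])

# Wrong: the join repeats order_total once per line (grain mismatch).
wrong = conn.execute("""
    SELECT SUM(o.order_total)
    FROM fact_orders o
    JOIN order_lines l USING (order_id)
""").fetchone()[0]

# Right: aggregate at the fact table's own grain (one row per order).
right = conn.execute("SELECT SUM(order_total) FROM fact_orders").fetchone()[0]
```

Here `wrong` triples the real revenue while `right` returns it exactly, which is why defining the grain of every fact table up front matters.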

The Bottom Line

Data warehouse design is fundamentally about separating concerns: raw data from transformed data, facts from dimensions, source logic from business logic. Start with a star schema, build a robust staging layer, adopt dbt for transformations, and define your grain before writing a single table. The teams that get this right early avoid the painful rewrites that slow every analytics team down at scale.