Materialised Views¶
This module provides a SQLAlchemy-native pattern for defining, creating, refreshing, and orchestrating materialized views using normal Select constructs, with explicit dependency management and deterministic refresh order.
It is designed for:
- analytics and reporting layers
- large fact tables with repeated joins or aggregates
- schema-level orchestration (migrations, setup, Airflow, admin tasks)
The implementation is PostgreSQL-oriented (due to materialized view support), but remains cleanly isolated from ORM persistence logic.
Overview¶
The materialized view system consists of four main parts:
CreateMaterializedView: A custom SQLAlchemy DDLElement that compiles a Select into aCREATE MATERIALIZED VIEW IF NOT EXISTSstatement.MaterializedViewMixin: A mixin used to define materialized views declaratively, including:- name
- backing
Select - optional dependencies
- Dependency resolution: A topological sort over declared dependencies to determine refresh order.
- Refresh orchestration: Helpers to refresh one or many materialized views safely and predictably.
Defining the Materialised View¶
Bases: DDLElement
CreateMaterializedView
SQLAlchemy DDL element representing a CREATE MATERIALIZED VIEW statement.
This custom DDL construct allows a SQLAlchemy Select construct to be compiled into a backend-specific CREATE MATERIALIZED VIEW statement, enabling materialized view creation to be expressed using SQLAlchemy's DDL execution model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
name
|
Name of the materialized view to be created. |
required | |
selectable
|
A SQLAlchemy Select construct defining the query backing the materialized view. |
required |
MaterializedViewMixin
Mixin providing materialized view lifecycle helpers.
Classes using this mixin must define:
__mv_name__: the name of the materialized view__mv_select__: a SQLAlchemy Select defining the view contents- optionally,
__mv_dependencies__: names of tables or materialized views this MV depends on
This mixin does not define ORM mappings; it is intended for schema-level helpers used during migrations, setup, or administrative workflows.
Examples:
class RecentObservationMV(MaterializedViewMixin):
__mv_name__ = "mv_recent_observation"
__mv_select__ = (
select(
Observation.observation_id,
Observation.person_id,
Observation.observation_date,
Observation.value_as_number,
Concept.concept_id,
Concept.concept_name,
Concept.domain_id,
)
.join(
Concept,
Observation.observation_concept_id == Concept.concept_id
)
.where(
Observation.observation_date
>= func.current_date() - text("INTERVAL '30 days'")
)
)
__mv_select__ is a normal SQLAlchemy Select. No special syntax required.
By combining with declarative base, you can define columns to query the mv as an object too:
daily_counts_select = (
select(
Observation.observation_date.label("observation_date"),
Observation.observation_concept_id.label("concept_id"),
sa.func.count().label("n_observations"),
sa.func.row_number().over().label('mv_id')
)
.group_by(
Observation.observation_date,
Observation.observation_concept_id,
)
)
class DailyObservationCountsMV(Base, MaterializedViewMixin):
__mv_name__ = "mv_daily_observation_counts"
__mv_select__ = daily_counts_select
__mv_pk__ = ["mv_id"]
__table_args__ = {"extend_existing": True}
__tablename__ = __mv_name__
__mv_dependencies__ = {
"observation",
"concept",
}
mv_id = sa.Column(primary_key=True)
observation_date = sa.Column(sa.Date, nullable=False)
concept_id = sa.Column(sa.Integer, nullable=False)
n_observations = sa.Column(sa.Integer, nullable=False)
rows = (
session.query(DailyObservationCount)
.filter(DailyObservationCount.observation_date >= date(2025, 1, 1))
.order_by(DailyObservationCount.n_observations.desc())
.all()
)
Best practices
- No inserts / updates
- Composite PK required for ORM identity map
- Treat as immutable cache
__mv_dependencies__ = set()
class-attribute
instance-attribute
¶
__mv_name__
instance-attribute
¶
__mv_select__
instance-attribute
¶
create_mv(bind)
classmethod
¶
Create the materialized view if it does not already exist.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
bind
|
A SQLAlchemy Engine or Connection used to execute the DDL. |
required |
Notes
The underlying SQL is emitted via a custom DDL element and executed directly against the database. This operation is not transactional on all backends.
Examples:
with engine.begin() as conn:
RecentObservationMV.create_mv(conn)
This emits SQL equivalent to:
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_recent_observation AS
SELECT
observation.observation_id,
observation.person_id,
observation.observation_date,
observation.value_as_number
FROM observation
WHERE observation.observation_date >= CURRENT_DATE - INTERVAL '30 days';
refresh_mv(bind)
classmethod
¶
Refresh the contents of the materialized view.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
bind
|
A SQLAlchemy Engine or Connection used to execute the refresh. |
required |
Notes
This method issues a REFRESH MATERIALIZED VIEW statement and assumes backend support (e.g. PostgreSQL). Concurrent refresh semantics are not handled here.
Examples:
with engine.begin() as conn:
RecentObservationMV.refresh_mv(conn)