Layered Architecture¶

OMOP Alchemy is built as a deliberately layered system.

Each layer adds capability while preserving the guarantees of the layer below it. Responsibilities flow downward; semantic intent flows upward.

The result is a system that is:

composable
inspectable
safe for both ETL and analytics

The Layer Stack¶

flowchart TD subgraph L0["orm-loader"] L0a["CSVLoadableTableInterface"] L0b["SerialisableTableInterface"] L0c["Bulk load & casting helpers"] end subgraph L1["cdm.base"] L1a["CDMTableBase"] L1b["Column helpers"] L1c["Structural mixins"] L1d["ReferenceContext"] L1e["DomainValidation primitives"] end subgraph L2["cdm.models"] L2a["Concrete CDM tables"] L2b["@cdm_table"] L2c["OMOP-compliant schemas"] end subgraph L3["Views & Contexts"] L3a["Reference contexts"] L3b["Derived properties"] L3c["Hybrid expressions"] end subgraph L4["Semantic Validation"] L4a["ExpectedDomain"] L4b["DomainRule"] L4c["Runtime domain checks"] end L0 --> L1 L1 --> L2 L2 --> L3 L3 --> L4

Layer responsibilities¶

orm-loader (L0)¶

Purpose: ingestion and infrastructure

This layer provides:

CSV loading
bulk inserts
type casting
serialization helpers

It is deliberately domain-agnostic.

If something understands OMOP concepts, vocabularies, or clinical meaning, it does not belong here.

Examples:

cdm.base (L1)¶

Purpose: structural OMOP semantics

This layer encodes:

common OMOP table structure
column patterns (required vs optional)
reusable mixins
reference relationship mechanics
domain validation primitives

This is where OMOP’s shape lives, but not its analytical meaning.

Examples:

This layer answers:

“What does a valid OMOP table look like?”

cdm.models (L2)¶

Purpose: concrete OMOP tables

This layer defines:

actual CDM tables
exact column layouts
primary keys and foreign keys
official OMOP schemas

Classes in this layer:

are safe for ETL
are safe for bulk loading
avoid eager relationships
avoid analytical helpers

Examples:

Person

Views & Contexts (L3)¶

Purpose: navigation and analysis

This layer adds:

reference relationships
derived properties
hybrid expressions
query-friendly helpers

Examples:

PersonContext PersonView

Views are designed for interactive use, not ingestion.

Tables are for pipelines. Views are for people.

Semantic Validation (L4)¶

Purpose: make semantic expectations explicit

This layer introduces:

declared domain expectations
inspectable rules
runtime, advisory checks

Examples:

ExpectedDomain

This layer answers:

“Does this object reference the kinds of concepts I think it does?”

Validation here is:

non-blocking
non-mutating
safe to skip
safe to run interactively

Directional guarantees¶

The stack is intentionally one-directional:

lower layers never import higher layers
ETL code never depends on analytical helpers
validation never mutates state

This guarantees:

predictable ingestion
expressive analysis
minimal coupling
documentation that stays aligned with code

Why this separation matters¶

Implementations that collapse these concerns into a single ORM layer will tend to produce:

accidental joins in ETL
brittle analytical code
hard-to-debug semantics
documentation drift