Integration¶
Info
This guide shows how to make your package support omop-config configure <package> and use oa-configurator for engine creation and logging.
Overview¶
Each package that integrates with oa-configurator:
- Subclasses PackageConfigBase with its typed config fields
- Registers the class via an entry point in pyproject.toml
- Calls MyPackageConfig.get_config() to read its config
- Uses Resolver.from_active_config().resolve_resource("default").create_engine() for SQLAlchemy
- Calls configure_logging(verbosity=verbose, extra_namespaces=["<package>"]) at startup
Steps for integration¶
1. Add the dependency¶
In your pyproject.toml:
[project]
dependencies = [
    "oa-configurator>=0.2.0",  # version may vary
]
2. Define your config class¶
In src/<package>/config.py:
from typing import ClassVar

from pydantic import Field
from oa_configurator import PackageConfigBase


class MyPackageConfig(PackageConfigBase):
    tool_name: ClassVar[str] = "my_package"  # maps to [tools.my_package] in TOML

    # Declare typed fields; they're backed by ToolConfig.extra in the TOML
    backend: str = Field(default="default", description="Backend to use.")
    data_path: str | None = Field(default=None, description="Path to local data files.")
get_config() is inherited from PackageConfigBase: call MyPackageConfig.get_config() to load your config from the active stack config. from_stack() reads the [tools.my_package.extra] section and validates it against your typed fields. If the section is missing, fields fall back to their defaults.
3. Register the entry point¶
[project.entry-points."omop.config"]
my_package = "my_package.config:MyPackageConfig"
After installing your package, omop-config configure my_package will find and prompt through your fields.
4. Engine creation¶
from oa_configurator import Resolver
engine = Resolver.from_active_config().resolve_resource("default").create_engine()
create_engine() applies the schema_translate_map automatically so OMOP ORM models route to the right schemas without changes.
For the vocabulary database:
vocab_engine = Resolver.from_active_config().resolve_resource("default").create_engine(role="vocab")
5. Logging¶
At your package's CLI entry point or startup:
from oa_configurator import configure_logging
# verbosity comes from the -v/-vv CLI flag count
configure_logging(verbosity=verbose, extra_namespaces=["my_package"])
To use the [logging] block from the config file, pass load_stack_config() as the first argument (verbosity= can still be supplied alongside it, as here):
configure_logging(load_stack_config(), verbosity=verbose, extra_namespaces=["my_package"])
Testing¶
Tests fall into two tiers with different requirements.
Unit tests: StackConfig.for_session()¶
Unit and mock-based tests must never touch ~/.config/omop/config.toml. Use
StackConfig.for_session() with monkeypatch to inject a fully in-memory config:
from oa_configurator import StackConfig, Resolver


def test_something(monkeypatch):
    cfg = StackConfig.for_session(
        databases={"db": {"dialect": "sqlite", "database_name": ":memory:"}},
        resources={"default": {"database": "db", "cdm_schema": "omop"}},
        tools={"my_package": {"extra": {"backend": "test_backend"}}},
    )
    monkeypatch.setattr("my_package.module.load_stack_config", lambda: cfg)
    # ... test against the in-memory config
This covers the vast majority of tests. No file I/O, no environment-specific setup needed.
for_session() + monkeypatch is NOT a fallback
for_session() + monkeypatch is for isolated unit tests where only the config values matter, not where the config comes from. It MUST NEVER be used to paper over missing configuration in CI or local dev. If a code path needs get_config()/load_stack_config() to succeed at all, you need to provide a real configuration for the test case.
Integration tests: dedicated test resource¶
For tests that exercise a real database (e.g. PostgreSQL-specific SQL, bulk loading, trigger management), use a dedicated named resource in the user's config and never a profile override of the production resource.
The canonical resource name is test_<package>_db (e.g. test_cdm_db for omop-alchemy).
Keeping the name distinct from the production resource (cdm_db) is a mandatory safety guard:
the test suite must never accidentally connect to a production database.
In conftest.py, resolve the test resource via the resolve_test_resource pytest-plugin
helper, which skips cleanly whether config.toml is entirely missing or simply
doesn't have this resource configured yet:
import pytest
import sqlalchemy as sa


@pytest.fixture(scope="session")
def pg_engine():
    from oa_configurator.pytest_plugin import resolve_test_resource
    from my_package.config import MyPackageConfig

    # Skips the dependent tests if the test resource is not configured
    url = resolve_test_resource(MyPackageConfig.TEST_DB)
    engine = sa.create_engine(url, future=True)
    yield engine
    engine.dispose()
Why not OA_ACTIVE_PROFILE=test? Setting OA_ACTIVE_PROFILE globally in conftest.py
affects every test that calls load_stack_config(), including unit tests that monkeypatch it.
A dedicated resource name scopes the real-DB resolution to only the fixture that needs it, and
leaves cdm_db unambiguously pointing at production data throughout the test session.
Provisioning the test resource¶
The test resource must be provisioned for real before pytest runs. There is no fallback that
papers over a missing one, by design (see the callout above). omop-config configure <package>
accepts --test-* flags (mirroring every owned-resource flag, e.g. --test-host, --test-port,
--test-database-name) for exactly this. The test- prefix is fixed and the same for every
package; it is not configurable per package.
Interactively, run omop-config configure <package> and answer Y when asked to configure a
test database resource. For the one-shot, non-interactive setup that CI uses, pass --test-*
flags as part of the same omop-config configure <package> step that configures the package's
owned resource (if it has one):
omop-config configure <db-package> \
--test-dialect postgresql+psycopg --test-database test_cdm \
--test-host localhost --test-port 5432 --test-cdm-schema public \
--test-user test --test-password test --test-database-name test_db
--test-* flags work independently of the owned-resource flags, so you can
pass only the test flags when CI needs the test configuration but not the
owned resource itself.
Safety
The test resource must point to a dedicated, empty database.
If your test session drops and recreates schemas, add a runtime guard that compares the
resolved URL of test_cdm_db against all other configured resources and calls
pytest.fail() on any match.
Multiple environments¶
Adding a second resource of the same type¶
The omop-config configure command creates one resource per semantic name by default
(e.g. cdm_db for omop-alchemy). To add a second — for example a production CDM
alongside a local development one — use --resource-name:
omop-config configure omop_alchemy --resource-name cdm_db_prod
This creates cdm_db_prod without touching the existing cdm_db. Because two
resources now exist for the same package, configure automatically prompts you to
choose the default at the end of the same run — no second invocation needed.
To change the default later, edit config.toml directly:
[tools.omop_alchemy]
default_resource = "cdm_db_prod"
When does default_resource matter?¶
default_resource is only consulted when you have multiple resources of the same type
configured. With a single CDM resource and a single EMB resource, it is set
automatically by configure and you never need to think about it.
resource_aliases — a naming compatibility tool¶
resource_aliases is NOT for switching environments. It solves a different problem:
if your resource is named differently from what packages expect (e.g. my_hospital_cdm
instead of cdm_db), aliases let packages find it without renaming your resource:
[resource_aliases]
cdm_db = "my_hospital_cdm" # packages asking for cdm_db get my_hospital_cdm
For multi-environment use, create multiple named resources and use default_resource
instead.
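The alias lookup can be pictured as a single dictionary indirection. The sketch below is illustrative only and is not oa-configurator's resolver: resolve_resource_name is a hypothetical helper, and letting an alias take precedence over a resource with the same name is an assumption the source does not specify.

```python
def resolve_resource_name(
    requested: str,
    resources: set[str],
    aliases: dict[str, str],
) -> str:
    """Translate the name a package asks for into a configured resource name."""
    # Assumption: an alias entry, if present, wins over a same-named resource.
    target = aliases.get(requested, requested)
    if target not in resources:
        raise KeyError(f"no resource named {target!r} (requested {requested!r})")
    return target
```

With resources {"my_hospital_cdm"} and aliases {"cdm_db": "my_hospital_cdm"}, a package asking for cdm_db is handed my_hospital_cdm, matching the TOML example above.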
Docker Compose¶
How it works¶
~/.config/omop/config.toml lives on the host (or in a container's home directory) and is
the single source of truth. Docker Compose is only needed to provide database credentials
at container startup. The app itself always reads from the TOML file, never from environment
variables at runtime.
The workflow:
- A gitignored .env file holds secrets that Docker Compose substitutes into its YAML.
- The container's startup command calls omop-config configure <package> with --flags, writing those values into config.toml once at startup.
- After that, the app reads config.toml normally without any environment variables involved.
The .env file is a Docker Compose concern only. It is never loaded by the Python app.
Example¶
.env (gitignored):
POSTGRES_USER=omop
POSTGRES_PASSWORD=secret
POSTGRES_DB=omop_cdm
docker-compose.yml:
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: ${POSTGRES_DB}
  app:
    build: .
    environment:
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: ${POSTGRES_DB}
    command: >
      bash -c "
      omop-config configure my_package
      --database cdm --dialect postgresql+psycopg
      --host db --port 5432
      --user $$POSTGRES_USER --password $$POSTGRES_PASSWORD
      --database-name $$POSTGRES_DB --cdm-schema omop &&
      exec my_app_entrypoint
      "
$$VAR escaping
Use $$VAR (double dollar) inside a command: string so Docker Compose passes the
literal variable name to the shell rather than substituting it at YAML-parse time.
If your stack has more than one package (e.g., my_package and omop_alchemy), add a
separate omop-config configure call for each, chained with &&:
command: >
  bash -c "
  omop-config configure omop_alchemy
  --database cdm --dialect postgresql+psycopg
  --host db --port 5432
  --user $$POSTGRES_USER --password $$POSTGRES_PASSWORD
  --database-name $$POSTGRES_DB --cdm-schema omop &&
  omop-config configure my_package
  --database cdm --dialect postgresql+psycopg
  --host db --port 5432
  --user $$POSTGRES_USER --password $$POSTGRES_PASSWORD
  --database-name $$POSTGRES_DB --cdm-schema omop &&
  exec my_app_entrypoint
  "
Each call is scoped to its own package. The --host value for omop_alchemy
configures the CDM database; the --host value for my_package configures that
package's database. No prefix is needed because the package name is the namespace.
Security note¶
The config.toml written by the container will contain the database password in plaintext.
This is acceptable for local development containers. Restrict the file permissions:
chmod 600 ~/.config/omop/config.toml
omop-config will warn at load time if the file has looser permissions.
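If you want to check or tighten the permissions from Python, for example in a container entrypoint, the POSIX mode bits can be inspected with the standard library. This is a sketch of the idea only: it is not the check omop-config performs, the helper names are hypothetical, and it assumes a POSIX filesystem.

```python
import os
import stat
from pathlib import Path


def is_too_permissive(path: Path) -> bool:
    """True if group or others have any access to `path` (POSIX mode bits)."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return bool(mode & 0o077)


def restrict(path: Path) -> None:
    """Equivalent of `chmod 600`: owner read/write only."""
    os.chmod(path, 0o600)
```

A typical use is to call restrict(Path.home() / ".config" / "omop" / "config.toml") when is_too_permissive reports True.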
TOML snippet for your README¶
Add a Configuration section to your package README:
## Configuration
Requires `omop-config` to be run once. See the
[`oa-configurator` quickstart](link) for initial setup.
Add the following to `~/.config/omop/config.toml`:
\`\`\`toml
[tools.my_package]
default_resource = "default"
[tools.my_package.extra]
backend = "default"
data_path = "/path/to/data"
\`\`\`
Then run:
\`\`\`bash
omop-config configure my_package
\`\`\`