Interface Guide
omop-emb exposes two complementary Python interfaces:
- EmbeddingWriterInterface: write + read. Wraps an EmbeddingClient for embedding generation, model registration, and upsert.
- EmbeddingReaderInterface: read-only. No EmbeddingClient needed; nearest-neighbour queries and registry lookups only.
Both interfaces accept a pre-constructed EmbeddingBackend (sqlite-vec or pgvector) and validate model names via the configured provider.
Constructing a backend
Resolve the active backend from environment variables using resolve_backend:
from omop_emb.backends import resolve_backend
backend = resolve_backend() # reads OMOP_EMB_BACKEND + connection variables
Or construct one directly:
from omop_emb.backends.sqlitevec import SQLiteVecEmbeddingBackend
from omop_emb.backends.pgvector import PGVectorEmbeddingBackend
# sqlite-vec
backend = SQLiteVecEmbeddingBackend.from_path(db_path="/data/omop_emb.db")
# pgvector
backend = PGVectorEmbeddingBackend.from_db_url(db_url="postgresql+psycopg://user:pass@host:5432/db")
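To picture the environment-driven path, here is a minimal sketch of how a resolver of this kind could dispatch on OMOP_EMB_BACKEND. It is not the implementation of resolve_backend: the function name pick_backend_kind, the accepted values "sqlite-vec" and "pgvector", and the variables EXAMPLE_SQLITE_PATH and EXAMPLE_DB_URL are all assumptions made for illustration, since this guide only names OMOP_EMB_BACKEND explicitly.

```python
from typing import Mapping


def pick_backend_kind(env: Mapping[str, str]) -> tuple[str, str]:
    """Return (backend_kind, connection_target) from an environment mapping.

    Hypothetical illustration: EXAMPLE_SQLITE_PATH and EXAMPLE_DB_URL are
    made-up names, not variables that omop-emb is known to read.
    """
    kind = env.get("OMOP_EMB_BACKEND", "").strip().lower()
    if kind == "sqlite-vec":
        return kind, env["EXAMPLE_SQLITE_PATH"]
    if kind == "pgvector":
        return kind, env["EXAMPLE_DB_URL"]
    # Fail loudly rather than silently falling back to a default backend.
    raise ValueError(f"unsupported or missing OMOP_EMB_BACKEND: {kind!r}")
```

Passing the mapping in (for example os.environ) rather than reading it globally keeps the selection logic easy to test.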
EmbeddingWriterInterface
Creating the interface
from omop_emb import EmbeddingWriterInterface, EmbeddingClient
from omop_emb.config import MetricType
embedding_client = EmbeddingClient(
    model="nomic-embed-text:v1.5",
    api_base="http://localhost:11434/v1",
)
# cdm_engine: an engine connected to the OMOP CDM database, created elsewhere
writer = EmbeddingWriterInterface(
    backend=backend,
    metric_type=MetricType.COSINE,
    embedding_client=embedding_client,
    omop_cdm_engine=cdm_engine,  # optional; required for embed_and_upsert_concepts
)
Register and initialise
from omop_emb.backends.index_config import FlatIndexConfig
# Always register with FLAT first
writer.register_model() # uses FlatIndexConfig() by default
writer.register_model(index_config=FlatIndexConfig()) # explicit equivalent
register_model is idempotent — calling it when the model is already registered
is safe and returns the existing record.
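The idempotency guarantee can be illustrated with a toy in-memory registry. This is a sketch of the general pattern only; ModelRecord and InMemoryRegistry are hypothetical names and do not describe how omop-emb stores its registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRecord:
    model_name: str
    dimensions: int
    index_type: str


class InMemoryRegistry:
    """Toy stand-in for a model registry table (hypothetical)."""

    def __init__(self) -> None:
        self._records: dict[str, ModelRecord] = {}

    def register_model(
        self, model_name: str, dimensions: int, index_type: str = "flat"
    ) -> ModelRecord:
        existing = self._records.get(model_name)
        if existing is not None:
            # Already registered: return the stored record unchanged.
            return existing
        record = ModelRecord(model_name, dimensions, index_type)
        self._records[model_name] = record
        return record


registry = InMemoryRegistry()
first = registry.register_model("nomic-embed-text:v1.5", 768)
second = registry.register_model("nomic-embed-text:v1.5", 768)
```

Here the second call creates nothing new and hands back the same record as the first, which is what makes repeated start-up registration safe.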
Generate and store embeddings
# Generate embeddings from CDM concepts and upsert in one step.
# omop_cdm_engine is used to fetch domain_id, vocabulary_id, standard_concept,
# and invalid_reason from the CDM and store them as filter metadata.
writer.embed_and_upsert_concepts(
    omop_cdm_engine=cdm_engine,
    concept_ids=(1, 2, 3),
    concept_texts=("Hypertension", "Diabetes mellitus", "Aspirin"),
)
Asymmetric embedding models
embed_and_upsert_concepts always applies the document role, and
get_nearest_concepts_from_query_texts always applies the query role.
When calling embed_texts directly you must pass embedding_role explicitly.
See Asymmetric Embeddings for task prefix configuration.
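As a rough illustration of what applying a role means, the sketch below prepends a task prefix to each text before it would be sent to the model. The prefixes shown are the ones nomic-embed-text documents for retrieval; Role, TASK_PREFIXES, and apply_role are hypothetical names and are not the embedding_role type or the prefix configuration used by omop-emb.

```python
from enum import Enum


class Role(Enum):
    DOCUMENT = "document"
    QUERY = "query"


# Retrieval prefixes documented for nomic-embed-text; other asymmetric models
# may expect different prefixes, and symmetric models expect none.
TASK_PREFIXES = {
    Role.DOCUMENT: "search_document: ",
    Role.QUERY: "search_query: ",
}


def apply_role(texts, role: Role) -> tuple:
    """Prefix every text with the task prefix for the given role."""
    prefix = TASK_PREFIXES[role]
    return tuple(prefix + text for text in texts)


stored_side = apply_role(("Hypertension",), Role.DOCUMENT)
query_side = apply_role(("high blood pressure",), Role.QUERY)
```

Stored concepts and incoming queries go through different prefixes, which is why mixing up the roles when embedding by hand degrades retrieval quality.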
Build an HNSW index
After all embeddings are ingested, optionally upgrade to an approximate index:
from omop_emb.backends.index_config import HNSWIndexConfig
from omop_emb.config import MetricType
writer.rebuild_index(
    index_config=HNSWIndexConfig(
        metric_type=MetricType.COSINE,
        num_neighbors=16,
        ef_construction=64,
        ef_search=16,
    )
)
This is equivalent to running omop-emb maintenance rebuild-index --index-type hnsw from the CLI.
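For readers on the pgvector backend, it may help to see what HNSW parameters of this kind look like at the SQL level. The sketch below builds the statements that pgvector itself accepts for an HNSW index. It assumes that num_neighbors corresponds to pgvector's m option, and the table and column names are hypothetical; whether omop-emb issues exactly these statements is not stated in this guide.

```python
# pgvector operator classes for each distance metric.
OPCLASS_BY_METRIC = {
    "cosine": "vector_cosine_ops",
    "l2": "vector_l2_ops",
    "inner_product": "vector_ip_ops",
}


def hnsw_statements(
    table: str, column: str, metric: str, m: int, ef_construction: int, ef_search: int
) -> list[str]:
    """Build pgvector HNSW DDL plus the query-time ef_search setting.

    Identifiers are interpolated directly, so only pass trusted names.
    """
    opclass = OPCLASS_BY_METRIC[metric]
    return [
        f"CREATE INDEX ON {table} USING hnsw ({column} {opclass}) "
        f"WITH (m = {m}, ef_construction = {ef_construction});",
        f"SET hnsw.ef_search = {ef_search};",
    ]


# "concept_embedding" and "embedding" are placeholder names.
statements = hnsw_statements("concept_embedding", "embedding", "cosine", 16, 64, 16)
```

In pgvector, m and ef_construction are fixed when the index is built, while hnsw.ef_search is a session setting that trades recall against query speed.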
EmbeddingReaderInterface
Use this when you only need to query stored embeddings — no embedding generation,
no EmbeddingClient required.
from omop_emb import EmbeddingReaderInterface
from omop_emb.config import MetricType, ProviderType
reader = EmbeddingReaderInterface(
    model="nomic-embed-text:v1.5",
    backend=backend,
    metric_type=MetricType.COSINE,
    provider_name_or_type=ProviderType.OLLAMA,
    omop_cdm_engine=cdm_engine,  # optional; enriches results with concept_name
)
Query nearest concepts
import numpy as np
from omop_emb.utils.embedding_utils import EmbeddingConceptFilter
query_vec = np.array([[...]], dtype=np.float32) # shape (Q, D)
results = reader.get_nearest_concepts(
    query_embedding=query_vec,
    k=10,
    concept_filter=EmbeddingConceptFilter(
        require_standard=True,
        domains=("Condition", "Drug"),
        require_active=True,
    ),
)
# results: tuple[tuple[NearestConceptMatch, ...], ...] — one inner tuple per query row
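The nested result shape (one inner tuple per query row) can be seen in a small brute-force cosine search written in plain Python. This is a conceptual sketch, not the backend's search path; Match is a hypothetical stand-in for NearestConceptMatch, whose actual fields are not listed in this guide.

```python
import math
from typing import NamedTuple


class Match(NamedTuple):
    concept_id: int
    similarity: float


def _cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(queries, stored: dict, k: int) -> tuple:
    """Return one tuple of the k best matches for each query vector."""
    results = []
    for query in queries:
        scored = [Match(cid, _cosine(query, vec)) for cid, vec in stored.items()]
        scored.sort(key=lambda match: match.similarity, reverse=True)
        results.append(tuple(scored[:k]))
    return tuple(results)


stored_vectors = {1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0)}
out = nearest(((1.0, 0.0), (0.0, 2.0)), stored_vectors, k=2)
```

With Q = 2 query rows and k = 2, out holds two inner tuples of two matches each, ordered from most to least similar.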
Query by text
from omop_emb import EmbeddingClient
embedding_client = EmbeddingClient(
    model="nomic-embed-text:v1.5",
    api_base="http://localhost:11434/v1",
)
results = reader.get_nearest_concepts_from_query_texts(
    query_texts=("high blood pressure", "type 2 diabetes"),
    embedding_client=embedding_client,
    k=5,
)
FAISS fast path
Supply faiss_cache_dir to route searches through a pre-exported FAISS index
instead of the primary backend SQL path. The cache must have been exported
first with omop-emb maintenance export-faiss-cache. Requires omop-emb[faiss-cpu].
reader = EmbeddingReaderInterface(
    model="nomic-embed-text:v1.5",
    backend=backend,
    metric_type=MetricType.COSINE,
    provider_name_or_type=ProviderType.OLLAMA,
    faiss_cache_dir="/data/faiss_cache",
)
# Searches automatically use FAISS when the cache is fresh; SQL path otherwise.
The environment variable OMOP_EMB_FAISS_CACHE_DIR is checked as a fallback
when faiss_cache_dir is not passed directly.
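The precedence between the argument and the environment variable can be sketched as follows. resolve_faiss_cache_dir is a hypothetical helper written for illustration; only the variable name OMOP_EMB_FAISS_CACHE_DIR and the "argument first, environment second" order come from this guide.

```python
import os
from typing import Mapping, Optional


def resolve_faiss_cache_dir(
    explicit: Optional[str], env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Prefer the explicit argument, then the environment variable."""
    if explicit:
        return explicit
    # An unset or empty variable means "no FAISS cache configured".
    return env.get("OMOP_EMB_FAISS_CACHE_DIR") or None
```

When neither source provides a directory the helper returns None, which in this sketch would mean staying on the SQL path.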
EmbeddingConceptFilter
EmbeddingConceptFilter is an in-database pre-filter applied during KNN search.
All filtering happens before the nearest-neighbour step — only matching concepts
are candidates.
from omop_emb.utils.embedding_utils import EmbeddingConceptFilter
concept_filter = EmbeddingConceptFilter(
    domains=("Condition", "Observation"),  # restrict to specific OMOP domains
    vocabularies=("SNOMED", "ICD10CM"),    # restrict to specific vocabularies
    concept_ids=(313217, 4329847),         # restrict to specific concept IDs
    require_standard=True,                 # standard_concept = 'S' or 'C'
    require_active=True,                   # invalid_reason NOT IN ('D', 'U')
    limit=20,                              # cap on results returned
)
All fields are optional and combinable. The metadata that require_standard and
require_active test (standard_concept and invalid_reason) is stored as columns in
the embedding table, so both are resolved entirely inside the primary backend,
with no CDM round-trip at query time.
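The practical difference between filtering before and after the nearest-neighbour step shows up clearly on a toy example. The sketch below uses precomputed similarity scores and plain Python; Row, passes, knn_prefiltered, and knn_postfiltered are hypothetical names, and the predicate only mirrors the semantics described in the comments above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    concept_id: int
    domain_id: str
    standard_concept: str  # 'S', 'C', or ''
    invalid_reason: str    # '', 'D', or 'U'
    score: float           # similarity to the query, higher is closer


def passes(row, domains=None, require_standard=False, require_active=False) -> bool:
    if domains is not None and row.domain_id not in domains:
        return False
    if require_standard and row.standard_concept not in ("S", "C"):
        return False
    if require_active and row.invalid_reason in ("D", "U"):
        return False
    return True


def knn_prefiltered(rows, k, **flt):
    # Filter first, then rank: every returned row satisfies the filter.
    candidates = [row for row in rows if passes(row, **flt)]
    return sorted(candidates, key=lambda row: row.score, reverse=True)[:k]


def knn_postfiltered(rows, k, **flt):
    # Rank first, then filter: may return fewer than k rows.
    top = sorted(rows, key=lambda row: row.score, reverse=True)[:k]
    return [row for row in top if passes(row, **flt)]


rows = [
    Row(1, "Drug", "S", "", 0.99),
    Row(2, "Condition", "", "", 0.95),
    Row(3, "Condition", "S", "D", 0.90),
    Row(4, "Condition", "S", "", 0.80),
    Row(5, "Condition", "C", "", 0.70),
]
flt = dict(domains=("Condition",), require_standard=True, require_active=True)
pre = knn_prefiltered(rows, 2, **flt)
post = knn_postfiltered(rows, 2, **flt)
```

Pre-filtering returns concepts 4 and 5, while post-filtering returns nothing because the two closest rows both fail the filter. That is the motivation for restricting the candidate set before the search.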
EmbeddingClient and providers
Note
Currently, only OllamaProvider is supported.
EmbeddingClient wraps any OpenAI-compatible endpoint. It canonicalises the
model name at construction time and exposes canonical_model_name as the stable
identifier used in the registry.
from omop_emb import EmbeddingClient, OllamaProvider
# Ollama — provider inferred from URL
# Ollama: provider inferred from URL
client = EmbeddingClient(
    model="nomic-embed-text:v1.5",
    api_base="http://ollama:11434/v1",
)
# Explicit provider (custom or future backends)
client = EmbeddingClient(
    model="nomic-embed-text:v1.5",
    api_base="http://my-custom-host/v1",
    provider=OllamaProvider(),
)
print(client.canonical_model_name) # "nomic-embed-text:v1.5"
print(client.embedding_dim) # auto-discovered based on provider
Model name validation
Valid names
Ollama:
- ✅ nomic-embed-text:v1.5
- ✅ llama3:8b
- Any name with an explicit, immutable tag
Invalid names (raise ValueError)
Ollama:
- ❌ llama3 ("must include an explicit tag")
- ❌ llama3:latest ("uses the mutable ':latest' tag")
Info
Why the strictness? In long-term healthcare data storage, :latest is a
moving target. Running ollama pull llama3 silently changes which model
version :latest points to, breaking consistency between stored embeddings
and new query embeddings.
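The rules above can be approximated in a few lines. This is a sketch of the described behaviour rather than the provider's actual validator; validate_ollama_model_name is a hypothetical name, and the error messages simply reuse the phrases quoted in the list of invalid names.

```python
def validate_ollama_model_name(name: str) -> str:
    """Return the name unchanged if it carries an explicit, non-'latest' tag."""
    # Only the last path segment can carry the tag (e.g. "namespace/model:tag").
    last_segment = name.rsplit("/", 1)[-1]
    _, separator, tag = last_segment.partition(":")
    if not separator or not tag:
        raise ValueError(f"{name!r} must include an explicit tag")
    if tag.lower() == "latest":
        raise ValueError(f"{name!r} uses the mutable ':latest' tag")
    return name
```

Rejecting these names at construction time means a mismatch surfaces as an immediate ValueError instead of as silently inconsistent embeddings later.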
Utility functions
from omop_emb import list_registered_models
from omop_emb.config import ProviderType
models = list_registered_models(
    backend=backend,
    provider_type=ProviderType.OLLAMA,  # optional filter
)
for m in models:
    print(m.model_name, m.provider_type, m.dimensions, m.index_type)
Architecture
┌─────────────────────────────────────────────────────┐
│                Your Application Code                │
└──────────────┬──────────────────────────────────────┘
               │
        ┌──────┴──────────────────┐
        │                         │
        ▼                         ▼
┌───────────────────┐  ┌──────────────────────┐
│ EmbeddingWriter   │  │ EmbeddingReader      │
│ Interface         │  │ Interface            │
│ (write + read)    │  │ (read only)          │
└───────┬───────────┘  └──────────┬───────────┘
        │                         │
        ▼                         │
┌───────────────────┐             │
│ EmbeddingClient   │             │
│ + Provider        │             │
└───────┬───────────┘             │
        │                         │
        └──────────┬──────────────┘
                   │
          ┌────────┴────────┐
          │     Backend     │
          │   sqlite-vec    │
          │    pgvector     │
          └────────┬────────┘
                   │ (optional fast path)
          ┌────────┴────────┐
          │  FAISS sidecar  │
          │   (read-only)   │
          └─────────────────┘
EmbeddingWriterInterface inherits from EmbeddingReaderInterface — all reader
methods are available on the writer too.
Best practices
- Use the interfaces, not backends directly: they enforce canonical naming. EmbeddingWriterInterface for write flows, EmbeddingReaderInterface for query-only services.
- Use embedding_client.canonical_model_name when constructing a matching reader; it is guaranteed to be canonical.
- Always register with FlatIndexConfig first. Run rebuild_index or omop-emb maintenance rebuild-index after ingestion to build HNSW.
- CDM enrichment is optional: omit omop_cdm_engine when concept_name is not needed to avoid the CDM round-trip.
- FAISS is a read-acceleration sidecar: export with omop-emb maintenance export-faiss-cache and supply faiss_cache_dir to EmbeddingReaderInterface for faster approximate search.