OMOP Embeddings

omop-emb is an optional package that extends omop-graph with embedding support and additional graph reasoning tools for information retrieval and RAG-based knowledge extraction.

The package currently supports:

  • dynamic registration of embedding models
  • storage of multiple embedding models in the selected backend
  • embedding and lookup of OMOP concepts
  • multiple storage backends, each linked to PostgreSQL:
      • pgvector: storage in the original OMOP database
      • FAISS: efficient on-disk storage for low-RAM applications
  • an extension to omop-alchemy that supports the new tables
  • CLI scripts for adding embeddings to an existing OMOP CDM

Installation

Install the backend you actually want to use:

pip install "omop-emb[postgres]"
pip install "omop-emb[faiss]"
pip install "omop-emb[all]"

A plain pip install omop-emb installs only the shared core package.
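
Because the extras decide which backend libraries are present, it can help to check what is importable before selecting a backend. The following is a minimal sketch, not part of omop-emb: the function `installed_backends` and the mapping `BACKEND_MODULES` are hypothetical, and the module names `pgvector` and `faiss` are assumptions about what the `postgres` and `faiss` extras pull in.

```python
from importlib.util import find_spec

# Hypothetical mapping from backend name to the top-level modules it needs.
# Assumption: the "postgres" extra provides `pgvector` and the "faiss" extra
# provides `faiss`; adjust to whatever the extras actually install.
BACKEND_MODULES = {
    "postgres": ("pgvector",),
    "faiss": ("faiss",),
}


def installed_backends(backend_modules=BACKEND_MODULES):
    """Return the sorted names of backends whose modules are all importable."""
    return sorted(
        name
        for name, modules in backend_modules.items()
        # find_spec returns None when a top-level module cannot be found.
        if all(find_spec(module) is not None for module in modules)
    )
```

With only the core package installed, `installed_backends()` would be expected to return an empty list under these assumptions.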

At runtime, backend choice should also be explicit. The intended direction is:

  • install-time choice via extras
  • runtime choice via configuration, such as OMOP_EMB_BACKEND=postgres or OMOP_EMB_BACKEND=faiss, or via an argument passed to the respective interface (see the CLI reference)
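
The runtime selection described above could be resolved along the following lines. This is a minimal sketch under stated assumptions, not omop-emb's actual implementation: the helper `resolve_backend` is hypothetical, and both the precedence (an explicit argument overrides the environment variable) and the case-insensitive matching are assumptions. Only the variable name `OMOP_EMB_BACKEND` and the values `postgres` and `faiss` come from the text above.

```python
import os

# Backend names taken from the documentation above.
SUPPORTED_BACKENDS = ("postgres", "faiss")


def resolve_backend(explicit=None, environ=None):
    """Pick a backend from an explicit argument or OMOP_EMB_BACKEND.

    Hypothetical helper: an explicit argument wins over the environment
    (an assumption), and a missing or unknown value raises an error so
    that the choice is never made implicitly.
    """
    environ = os.environ if environ is None else environ
    value = explicit if explicit is not None else environ.get("OMOP_EMB_BACKEND")
    if value is None:
        raise ValueError(
            "No backend selected: pass one explicitly or set OMOP_EMB_BACKEND"
        )
    backend = value.strip().lower()
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {value!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    return backend
```

Failing loudly when nothing is configured keeps the backend choice explicit, in line with the intended direction stated above.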

Important caveats

  • omop-emb requires an OMOP PostgreSQL database in every configuration: either to store the embeddings themselves (pgvector) or to keep track of which concepts have already been embedded.

Documentation overview