OMOP Embeddings
omop-emb is an optional package to super-charge omop-graph and provide additional graph reasoning tools for information retrieval and RAG-based knowledge extraction.
The package currently supports:
- dynamic embedding model registration
  - multiple embedding models can be stored in the respective backend
- embedding and lookup for OMOP concepts
- supports various backends with a PostgreSQL linker
  - pgvector: storage in the original OMOP database
  - FAISS: efficient storage on disk for low-RAM applications
- Extension to omop-alchemy to support new tables
- CLI scripts to add embeddings to an already existing OMOP CDM
Installation
Install the backend you actually want to use:
pip install "omop-emb[postgres]"
pip install "omop-emb[faiss]"
pip install "omop-emb[all]"
A plain pip install omop-emb installs only the shared core package.
At runtime, backend choice should also be explicit. The intended direction is:
- install-time choice via extras
- runtime choice via config such as OMOP_EMB_BACKEND=postgres or OMOP_EMB_BACKEND=faiss, or by passing it as an argument to the respective interface (e.g. see CLI reference)
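The runtime rule above (an explicit argument or the OMOP_EMB_BACKEND variable) can be sketched as follows. This is only an illustration of the idea, not omop-emb's actual code: the helper name resolve_backend, its parameters, and the choice to let an explicit argument take precedence over the environment are all assumptions. Only the variable name and the two backend values come from the text above.

```python
import os
from typing import Mapping, Optional

# Backend names as listed in this README; the helper below is hypothetical.
SUPPORTED_BACKENDS = ("postgres", "faiss")


def resolve_backend(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick a backend name, preferring an explicit argument over the environment."""
    # Fall back to the real process environment unless a mapping is injected.
    env = os.environ if env is None else env
    choice = explicit if explicit is not None else env.get("OMOP_EMB_BACKEND")
    if choice is None:
        # Refuse to guess: the backend choice is meant to be explicit.
        raise ValueError(
            "No backend selected: pass one explicitly or set OMOP_EMB_BACKEND"
        )
    choice = choice.strip().lower()
    if choice not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {choice!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    return choice
```

Raising an error when nothing is configured, rather than silently defaulting to one backend, keeps the choice explicit in the sense described above. Whether the package itself behaves this way is not stated here.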
Important caveats
omop-emb depends on an OMOP PostgreSQL database, either to store the embeddings themselves (pgvector) or to keep track of which concepts have already been embedded.