KnowledgeGraph
omop_graph.graph.kg
OMOP-backed graph facade.
This module provides the KnowledgeGraph class, which acts as the primary
interface (facade) to the OMOP Common Data Model database.
Responsibilities
- SQLAlchemy Access: Manages the database session and executes queries.
- Caching: Implements LRU caching for high-frequency lookups (concepts, predicates).
- Predicate Semantics: Resolves relationship IDs to Predicate objects and Kinds.
- Edge/Node Retrieval: Provides methods to traverse the graph (parents, children, edges).
KnowledgeGraph
Bases: GraphBackend
The main entry point for interacting with the OMOP Graph.
This class wraps a SQLAlchemy session and provides high-level methods to query concepts, relationships, and metadata.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| session_factory | sessionmaker | The SQLAlchemy sessionmaker factory capable of creating separate sessions for each database access. | required |
emb
property
Namespace for all embedding operations.
Returns EmbeddingInterface if _emb_client is set (for write operations), otherwise returns EmbeddingReader (for read-only operations).
The interface/reader is created lazily on first access using _emb_backend and
_emb_base_storage_dir. Backend resolution follows omop_emb rules:
explicit backend argument first, then OMOP_EMB_BACKEND.
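The lazy creation and backend resolution order described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: the class GraphWithEmbeddings, the helper resolve_backend, and the placeholder tuple returned in place of a real EmbeddingInterface or EmbeddingReader are all hypothetical. Only the attribute names _emb_backend and _emb_client and the OMOP_EMB_BACKEND variable come from the text.

```python
import os
from functools import cached_property


def resolve_backend(explicit=None, env=os.environ):
    """Explicit argument wins; otherwise fall back to OMOP_EMB_BACKEND."""
    if explicit is not None:
        return explicit
    return env.get("OMOP_EMB_BACKEND")


class GraphWithEmbeddings:
    """Hypothetical illustration of a lazily built embedding namespace."""

    def __init__(self, backend=None, client=None, env=os.environ):
        self._emb_backend = backend
        self._emb_client = client
        self._env = env

    @cached_property
    def emb(self):
        # Built on first access only; later accesses reuse the same object.
        backend = resolve_backend(self._emb_backend, self._env)
        # A client enables writes; without one, only reading is possible.
        mode = "interface" if self._emb_client is not None else "reader"
        return (mode, backend)  # placeholder for the real interface/reader


reader_graph = GraphWithEmbeddings(env={"OMOP_EMB_BACKEND": "faiss"})
writer_graph = GraphWithEmbeddings(
    backend="pinecone", client=object(), env={"OMOP_EMB_BACKEND": "faiss"}
)
```

Here reader_graph.emb resolves its backend from the environment mapping, while writer_graph.emb uses the explicit argument even though the variable is also set.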
children(concept_id)
cached
Retrieve the child Concept IDs of a concept using the Concept_Ancestor table.
clear_caches()
Clear all LRU caches associated with the graph.
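To show how cached lookups and a clear-all method can fit together, here is a minimal sketch assuming functools.lru_cache as the caching mechanism. The class TinyGraph, its db_hits counter, and the sample data are hypothetical and are not taken from the actual KnowledgeGraph implementation.

```python
from functools import lru_cache


class TinyGraph:
    """Hypothetical stand-in for a graph facade with cached lookups."""

    def __init__(self, children_by_id):
        self._children_by_id = children_by_id
        self.db_hits = 0  # counts simulated database round trips
        # Wrap per instance so each graph owns its own cache.
        self.children = lru_cache(maxsize=1024)(self._children_uncached)

    def _children_uncached(self, concept_id):
        self.db_hits += 1
        return tuple(self._children_by_id.get(concept_id, ()))

    def clear_caches(self):
        # Drop every cached result so the next call queries again.
        self.children.cache_clear()


graph = TinyGraph({1: [2, 3]})
graph.children(1)
graph.children(1)  # served from the cache, no new round trip
hits_before_clear = graph.db_hits
graph.clear_caches()
graph.children(1)  # queried again after clearing
hits_after_clear = graph.db_hits
```

Clearing matters whenever the underlying vocabulary tables change during the lifetime of the process, since cached results would otherwise go stale.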
concept_id_by_code(vocabulary_id, concept_code)
cached
Look up a Concept ID using the vocabulary ID and concept code.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| vocabulary_id | str | The vocabulary ID (e.g., 'SNOMED', 'RxNorm'). | required |
| concept_code | str | The source code within that vocabulary. | required |
Returns:
| Type | Description |
|---|---|
| int | The resolved OMOP Concept ID. |
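The lookup amounts to a filter on the concept table's vocabulary_id and concept_code columns. The sketch below uses the standard library's sqlite3 with an in-memory table instead of SQLAlchemy. The row is made-up demo data, and raising KeyError on a miss is an assumption, since the text does not say how the real method handles unknown codes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept (concept_id INTEGER, vocabulary_id TEXT, concept_code TEXT)"
)
# Made-up row for illustration; not a real OMOP concept.
conn.execute("INSERT INTO concept VALUES (1001, 'DemoVocab', 'A1')")


def concept_id_by_code(conn, vocabulary_id, concept_code):
    """Return the concept_id for a (vocabulary_id, concept_code) pair."""
    row = conn.execute(
        "SELECT concept_id FROM concept "
        "WHERE vocabulary_id = ? AND concept_code = ?",
        (vocabulary_id, concept_code),
    ).fetchone()
    if row is None:
        # Assumption: a miss is reported as an error.
        raise KeyError((vocabulary_id, concept_code))
    return row[0]
```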
concept_ids_by_label(label)
cached
Find concept IDs that match the label exactly (case-insensitive).
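As a rough illustration of exact, case-insensitive matching, the sketch below compares casefolded strings over in-memory rows. The function signature and the sample rows are hypothetical; the real method queries the database and may normalize case differently.

```python
def concept_ids_by_label(rows, label):
    """rows: iterable of (concept_id, concept_name) pairs (made-up data)."""
    wanted = label.casefold()
    # The whole name must match; substrings and plurals do not count.
    return tuple(cid for cid, name in rows if name.casefold() == wanted)


rows = [
    (1, "Example Condition"),
    (2, "example condition"),
    (3, "Example Conditions"),
]
matches = concept_ids_by_label(rows, "EXAMPLE CONDITION")
```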
concept_lookup(label, match_kind, synonym=False, search_constraint=None, sort=True)
cached
Resolve a label to concept_id(s).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| label | str | The term to search for. | required |
| match_kind | LabelMatchKind | The kind of match to perform (exact, fulltext, partial). | required |
| synonym | bool | If True, searches in Concept_Synonym instead of Concept. | False |
| search_constraint | SearchConstraintConcept | Additional filters for domain/vocabulary. | None |
concept_view(concept_id)
cached
Retrieve a single concept view by ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| concept_id | int | The OMOP Concept ID. | required |
Returns:
| Type | Description |
|---|---|
| ConceptView | The immutable view of the concept. |
concept_views(concept_ids, sort=True)
cached
Retrieve multiple concept views in a batch.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| concept_ids | tuple[int, ...] | A tuple of OMOP Concept IDs. | required |
Returns:
| Type | Description |
|---|---|
| tuple[ConceptView, ...] | A tuple of concept views. |
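One likely reason the batch method takes a tuple and not a list is that cached functions need hashable arguments. The sketch below demonstrates this with functools.lru_cache, which is an assumption about the caching mechanism. The string placeholders stand in for real ConceptView objects.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def concept_views(concept_ids):
    # Placeholder strings stand in for real ConceptView objects.
    return tuple(f"view:{cid}" for cid in concept_ids)


views = concept_views((1, 2))  # a tuple is hashable, so this can be cached

try:
    concept_views([1, 2])  # a list is unhashable and cannot be a cache key
    list_accepted = True
except TypeError:
    list_accepted = False
```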
edges(concept_ids, direction, predicate_ids=None, predicate_kinds=None, active_only=True, on=None, within_domain=True)
cached
Convenience method to retrieve all edges from one or multiple concepts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| concept_ids | (int, tuple[int, ...]) | The source/target concept ID(s). | required |
| direction | str | 'out' for outgoing, 'in' for incoming. | required |
| predicate_ids | frozenset[str] | Filter by specific relationship IDs. | None |
| predicate_kinds | Set[ClassIDEnum] | Filter by semantic kind of relationship. | None |
| active_only | bool | If True, return only valid/active edges. | True |
| on | date | Check validity on a specific date. | None |
| within_domain | bool | If True, only return edges where source/target domains match. | True |
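The interplay of direction, predicate_ids, and on can be sketched over an in-memory edge list. This is an illustration only: the Edge tuple, the function body, and the sample data are hypothetical, and the predicate_kinds, active_only, and within_domain filters are left out for brevity.

```python
from datetime import date
from typing import NamedTuple


class Edge(NamedTuple):
    subject_id: int
    predicate_id: str
    object_id: int
    valid_start: date
    valid_end: date


def edges(all_edges, concept_ids, direction, predicate_ids=None, on=None):
    """Filter edges anchored at concept_ids in the given direction."""
    if direction not in ("out", "in"):
        raise ValueError("direction must be 'out' or 'in'")
    ids = {concept_ids} if isinstance(concept_ids, int) else set(concept_ids)
    result = []
    for edge in all_edges:
        # Outgoing edges are anchored at the subject, incoming at the object.
        anchor = edge.subject_id if direction == "out" else edge.object_id
        if anchor not in ids:
            continue
        if predicate_ids is not None and edge.predicate_id not in predicate_ids:
            continue
        if on is not None and not (edge.valid_start <= on <= edge.valid_end):
            continue
        result.append(edge)
    return tuple(result)


# Made-up demo edges.
DEMO = [
    Edge(1, "Is a", 2, date(2000, 1, 1), date(2099, 12, 31)),
    Edge(3, "Is a", 1, date(2000, 1, 1), date(2099, 12, 31)),
    Edge(1, "Maps to", 4, date(2000, 1, 1), date(2010, 1, 1)),
]
```

For example, edges(DEMO, 1, "out", on=date(2020, 1, 1)) keeps only the first edge, because the third one expired in 2010.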
get_all_concept_domain_ids()
Retrieve all distinct Domain IDs present in the concept table.
get_all_concept_vocabulary_ids()
Retrieve all distinct Vocabulary IDs present in the concept table.
get_num_ancestors(concept_ids)
Get the count of ancestors for a batch of concepts.
get_potential_ancestor(child_id, parent_id)
Check if an ancestry relationship exists between a child and parent.
leaves(domain_id=None, vocabulary_id=None)
cached
Retrieve leaf concepts (no children).
parents(concept_id)
cached
Retrieve the parent Concept IDs of a concept using the Concept_Ancestor table.
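In the OMOP CDM, concept_ancestor stores every ancestor/descendant pair together with min_levels_of_separation. A plausible way to obtain direct parents and children from it, assumed here and not confirmed by the text, is to keep only pairs separated by exactly one level. The sketch uses sqlite3 and made-up IDs forming the chain 1 → 2 → 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept_ancestor ("
    "ancestor_concept_id INTEGER, "
    "descendant_concept_id INTEGER, "
    "min_levels_of_separation INTEGER)"
)
conn.executemany(
    "INSERT INTO concept_ancestor VALUES (?, ?, ?)",
    # Self rows at level 0, direct links at level 1, one transitive link.
    [(1, 1, 0), (2, 2, 0), (3, 3, 0), (1, 2, 1), (2, 3, 1), (1, 3, 2)],
)


def parents(conn, concept_id):
    """Direct parents: ancestors exactly one level above (assumption)."""
    rows = conn.execute(
        "SELECT ancestor_concept_id FROM concept_ancestor "
        "WHERE descendant_concept_id = ? AND min_levels_of_separation = 1 "
        "ORDER BY ancestor_concept_id",
        (concept_id,),
    )
    return tuple(row[0] for row in rows)


def children(conn, concept_id):
    """Direct children: descendants exactly one level below (assumption)."""
    rows = conn.execute(
        "SELECT descendant_concept_id FROM concept_ancestor "
        "WHERE ancestor_concept_id = ? AND min_levels_of_separation = 1 "
        "ORDER BY descendant_concept_id",
        (concept_id,),
    )
    return tuple(row[0] for row in rows)
```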
predicate(relationship_id)
cached
Retrieve a Predicate object by its relationship ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| relationship_id | str | The OMOP relationship ID (e.g., 'maps to'). | required |
Returns:
| Type | Description |
|---|---|
| Predicate | The predicate definition. |
predicate_kind(relationship_id)
Classify the predicate into a semantic kind.
predicate_kinds(relationship_ids)
Classify a batch of predicates.
predicate_name(relationship_id)
cached
Retrieve the human-readable name of a relationship.
predicates()
Return all predicates known to the knowledge graph.
relationships(session, subjects, predicates, objects, invert=False)
Query relationships between concepts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| subjects | list[CURIE] \| None | List of subject CURIEs. | required |
| predicates | list[str] \| None | List of predicate (relationship) IDs. | required |
| objects | list[CURIE] \| None | List of object CURIEs. | required |
| invert | bool | If True, swaps subjects and objects in the query and result. | False |
Yields:
| Type | Description |
|---|---|
| Tuple[CURIE, PRED_CURIE, CURIE] | Triples (subject, predicate, object). |
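The following generator sketches one plausible reading of the filter and invert semantics over in-memory triples. It is not the library's implementation: the session argument is omitted, the CURIEs are made up, treating None as "no filter" is an assumption, and invert is interpreted as matching the caller's subjects against stored objects and emitting each triple with its ends swapped.

```python
def relationships(triples, subjects=None, predicates=None, objects=None, invert=False):
    """Yield (subject, predicate, object) triples that pass every filter."""
    if invert:
        # Query in the opposite direction: the caller's subjects are
        # matched against stored objects, and vice versa.
        subjects, objects = objects, subjects
    for subj, pred, obj in triples:
        if subjects is not None and subj not in subjects:
            continue
        if predicates is not None and pred not in predicates:
            continue
        if objects is not None and obj not in objects:
            continue
        # Swap the ends back so the caller's subject comes first.
        yield (obj, pred, subj) if invert else (subj, pred, obj)


# Made-up demo triples.
TRIPLES = [
    ("demo:1", "demo:is_a", "demo:2"),
    ("demo:3", "demo:is_a", "demo:2"),
]
forward = list(relationships(TRIPLES, subjects=["demo:1"]))
inverted = list(relationships(TRIPLES, subjects=["demo:2"], invert=True))
```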
reverse_predicate_id(relationship_id)
Get the reverse relationship ID, if it exists.
rollback_session()
Safely roll back the session if it is in a pending state.
roots(domain_id=None, vocabulary_id=None)
cached
Retrieve root concepts (no parents).
singletons(domain_id=None, vocabulary_id=None)
cached
Retrieve singleton concepts (no parents and no children).
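The three definitions (roots have no parents, leaves have no children, singletons have neither) can be illustrated with plain sets. The helper classify and its data are hypothetical. In this sketch a singleton also counts as both a root and a leaf, which follows from the definitions as stated but may differ from how the real methods partition concepts.

```python
def classify(concept_ids, parent_edges):
    """parent_edges: list of (child_id, parent_id) pairs (made-up data)."""
    has_parent = {child for child, _ in parent_edges}
    has_child = {parent for _, parent in parent_edges}
    roots = {cid for cid in concept_ids if cid not in has_parent}
    leaves = {cid for cid in concept_ids if cid not in has_child}
    singletons = roots & leaves  # neither parents nor children
    return roots, leaves, singletons


# Chain 3 -> 2 -> 1 plus the isolated concept 4.
roots, leaves, singletons = classify({1, 2, 3, 4}, [(2, 1), (3, 2)])
```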
specificity(concept_id)
Compute specificity as the inverse of out-degree. Higher is more specific.
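As a worked instance of the formula, the sketch below takes an out-degree and returns its inverse. Returning infinity for an out-degree of zero is an assumption made to keep the function total; the text does not say how the real method treats that case.

```python
def specificity(out_degree):
    """Inverse of out-degree; fewer outgoing edges means more specific."""
    if out_degree == 0:
        return float("inf")  # assumption: no outgoing edges is maximally specific
    return 1.0 / out_degree


broad = specificity(10)   # many outgoing edges
narrow = specificity(1)   # a single outgoing edge
```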
synonyms_for_concept(concept_id)
cached
Retrieve all synonyms for a concept.
KnowledgeGraphEmbeddingConfiguration
dataclass
Configuration for embedding-based operations in the knowledge graph.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| backend_type | EmbeddingBackendType | The embedding backend name (e.g., 'faiss', 'pinecone') or type to use. | None |
| base_storage_dir | str | The directory where embeddings are stored. | None |
| client | EmbeddingClient | An optional client instance for generating embeddings. If not provided, no writing operations can take place. | None |
| provider_type | EmbeddingProviderType | The respective provider name (e.g., 'openai', 'ollama') or type if using a read-only embedding reader interface. | None |
| canonical_model_name | str | The canonical model name to use for the embedding reader interface (e.g., 'text-embedding-3-small:0.6b'). Required for read-only embedding interface to determine which embeddings to retrieve for concepts. Obtained from client if a client is provided, otherwise must be set explicitly for read-only use cases. | None |
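The rule for canonical_model_name (taken from the client when one is present, otherwise required explicitly) can be sketched with a small dataclass. Everything here is hypothetical and only mirrors the field names in the table: the class EmbeddingConfigSketch, the DemoClient stand-in, the client attribute canonical_model_name, and the method resolved_model_name are not part of the documented API.

```python
from dataclasses import dataclass
from typing import Any, Optional


class DemoClient:
    """Hypothetical client exposing the model it embeds with."""

    canonical_model_name = "demo-model"


@dataclass
class EmbeddingConfigSketch:
    backend_type: Optional[str] = None
    base_storage_dir: Optional[str] = None
    client: Optional[Any] = None
    provider_type: Optional[str] = None
    canonical_model_name: Optional[str] = None

    @property
    def read_only(self):
        # Without a client no embeddings can be generated or written.
        return self.client is None

    def resolved_model_name(self):
        if self.client is not None:
            # Assumed attribute name on the client.
            return self.client.canonical_model_name
        if self.canonical_model_name is None:
            raise ValueError("canonical_model_name is required without a client")
        return self.canonical_model_name


writer = EmbeddingConfigSketch(backend_type="faiss", client=DemoClient())
reader = EmbeddingConfigSketch(backend_type="faiss", canonical_model_name="demo-model")
```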