KnowledgeGraph

omop_graph.graph.kg

OMOP-backed graph facade.

This module provides the KnowledgeGraph class, which acts as the primary interface (facade) to the OMOP Common Data Model database.

Responsibilities
  • SQLAlchemy Access: Manages the database session and executes queries.
  • Caching: Implements LRU caching for high-frequency lookups (concepts, predicates).
  • Predicate Semantics: Resolves relationship IDs to Predicate objects and Kinds.
  • Edge/Node Retrieval: Provides methods to traverse the graph (parents, children, edges).

KnowledgeGraph

Bases: GraphBackend

The main entry point for interacting with the OMOP Graph.

This class wraps a SQLAlchemy session and provides high-level methods to query concepts, relationships, and metadata.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| session_factory | sessionmaker | A SQLAlchemy sessionmaker used to create a separate session for each database access. | required |

emb property

Namespace for all embedding operations.

Returns EmbeddingInterface if _emb_client is set (for write operations), otherwise returns EmbeddingReader (for read-only operations).

The interface/reader is created lazily on first access using _emb_backend and _emb_base_storage_dir. Backend resolution follows omop_emb rules: explicit backend argument first, then OMOP_EMB_BACKEND.
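The lazy creation and the backend resolution order described above can be sketched as follows. This is a minimal illustration, not the library's code: `resolve_backend` and `LazyEmb` are hypothetical names, and a plain dict stands in for the real EmbeddingInterface/EmbeddingReader objects. Only the attribute names `_emb_backend` and `_emb_client` and the `OMOP_EMB_BACKEND` variable come from the text.

```python
import os
from typing import Optional


def resolve_backend(explicit: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: an explicit argument wins, else OMOP_EMB_BACKEND."""
    if explicit is not None:
        return explicit
    return os.environ.get("OMOP_EMB_BACKEND")


class LazyEmb:
    """Hypothetical stand-in that builds its embedding namespace on first use."""

    def __init__(self, backend: Optional[str] = None, client: object = None):
        self._emb_backend = backend
        self._emb_client = client
        self._emb = None  # nothing is built until .emb is first read

    @property
    def emb(self) -> dict:
        if self._emb is None:
            # A client enables writes ("interface"); without one, read-only ("reader").
            mode = "interface" if self._emb_client is not None else "reader"
            self._emb = {"mode": mode, "backend": resolve_backend(self._emb_backend)}
        return self._emb
```

Repeated reads of `emb` return the same object, so the backend is resolved once per instance in this sketch.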

children(concept_id) cached

Retrieve the child concept IDs of a concept using the Concept_Ancestor table.

clear_caches()

Clear all LRU caches associated with the graph.
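How an LRU cache and its reset interact can be sketched with the standard library. This assumes the cached methods behave like functions wrapped in `functools.lru_cache`; the page does not say which mechanism the class uses. The lookup body and the `db_hits` counter are illustrative only.

```python
from functools import lru_cache

# Counts how often the pretend "database" is actually hit (illustration only).
db_hits = {"count": 0}


@lru_cache(maxsize=1024)
def predicate_name(relationship_id: str) -> str:
    """Pretend lookup; the body runs only on a cache miss."""
    db_hits["count"] += 1
    return f"name of {relationship_id}"


def clear_caches() -> None:
    """Drop every cached result so the next call recomputes."""
    predicate_name.cache_clear()
```

Clearing matters after the underlying vocabulary tables change, because cached results would otherwise keep returning the old values.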

concept_id_by_code(vocabulary_id, concept_code) cached

Look up a Concept ID using the vocabulary ID and concept code.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| vocabulary_id | str | The vocabulary ID (e.g., 'SNOMED', 'RxNorm'). | required |
| concept_code | str | The source code within that vocabulary. | required |

Returns:

| Type | Description |
|---|---|
| int | The resolved OMOP Concept ID. |
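The lookup amounts to a two-column filter on the OMOP concept table. The sketch below uses an in-memory SQLite table with made-up IDs and codes rather than the real class. Returning None for a missing code is an assumption; the page does not say how the method handles a miss.

```python
import sqlite3
from typing import Optional

# Toy concept table with invented rows; the same code can exist in two vocabularies.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE concept ("
    "concept_id INTEGER PRIMARY KEY, vocabulary_id TEXT, concept_code TEXT)"
)
conn.executemany(
    "INSERT INTO concept VALUES (?, ?, ?)",
    [(1001, "SNOMED", "A1"), (1002, "RxNorm", "A1")],
)


def concept_id_by_code(vocabulary_id: str, concept_code: str) -> Optional[int]:
    """Resolve (vocabulary_id, concept_code) to a concept ID, or None if absent."""
    row = conn.execute(
        "SELECT concept_id FROM concept "
        "WHERE vocabulary_id = ? AND concept_code = ?",
        (vocabulary_id, concept_code),
    ).fetchone()
    return row[0] if row else None
```

Both arguments are needed because a concept code is only unique within its vocabulary.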

concept_ids_by_label(label) cached

Find concept IDs that match the label exactly (case-insensitive).
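An exact, case-insensitive match can be illustrated on a plain dictionary. This is a sketch under assumptions: the real method queries the database, and the use of `str.casefold` and the sorted tuple result are choices made here, not documented behaviour. The IDs and names are invented.

```python
from typing import Dict, Tuple


def concept_ids_by_label(concepts: Dict[int, str], label: str) -> Tuple[int, ...]:
    """Return IDs whose name equals the label, ignoring case."""
    wanted = label.casefold()
    return tuple(
        sorted(cid for cid, name in concepts.items() if name.casefold() == wanted)
    )


# Invented sample data for the sketch.
SAMPLE = {1: "Asthma", 2: "asthma", 3: "Asthma attack"}
```

Partial matches such as "Asthma attack" are excluded; only whole-label equality counts.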

concept_lookup(label, match_kind, synonym=False, search_constraint=None, sort=True) cached

Resolve a label to concept_id(s).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| label | str | The term to search for. | required |
| match_kind | LabelMatchKind | The kind of match to perform (exact, fulltext, partial). | required |
| synonym | bool | If True, searches in Concept_Synonym instead of Concept. | False |
| search_constraint | SearchConstraintConcept | Additional filters for domain/vocabulary. | None |

concept_view(concept_id) cached

Retrieve a single concept view by ID.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| concept_id | int | The OMOP Concept ID. | required |

Returns:

| Type | Description |
|---|---|
| ConceptView | The immutable view of the concept. |

concept_views(concept_ids, sort=True) cached

Retrieve multiple concept views in a batch.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| concept_ids | tuple[int, ...] | A tuple of OMOP Concept IDs. | required |

Returns:

| Type | Description |
|---|---|
| tuple[ConceptView, ...] | A tuple of concept views. |
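One likely reason the batch argument is a tuple is that cached calls need hashable arguments. The sketch below shows this with `functools.lru_cache`; whether the class uses that exact decorator is an assumption, and the string "views" are placeholders for ConceptView objects.

```python
from functools import lru_cache
from typing import Tuple


@lru_cache(maxsize=None)
def concept_views(concept_ids: Tuple[int, ...]) -> Tuple[str, ...]:
    """Placeholder batch lookup; a tuple argument can serve as a cache key."""
    return tuple(f"view-{cid}" for cid in concept_ids)
```

Passing a list instead of a tuple to a function cached this way raises TypeError, because lists are unhashable.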

edges(concept_ids, direction, predicate_ids=None, predicate_kinds=None, active_only=True, on=None, within_domain=True) cached

Convenience method to retrieve all edges from one or multiple concepts.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| concept_ids | (int, tuple[int, ...]) | The source/target concept ID(s). | required |
| direction | str | 'out' for outgoing, 'in' for incoming. | required |
| predicate_ids | frozenset[str] | Filter by specific relationship IDs. | None |
| predicate_kinds | Set[ClassIDEnum] | Filter by semantic kind of relationship. | None |
| active_only | bool | If True, return only valid/active edges. | True |
| on | date | Check validity on a specific date. | None |
| within_domain | bool | If True, only return edges where source/target domains match. | True |
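The direction, predicate, and date filters can be sketched over an in-memory edge list. This is an illustration only: `Edge`, `filter_edges`, and the sample rows are hypothetical, the validity window is modelled on the valid_start_date/valid_end_date columns of the OMOP concept_relationship table, and treating both bounds as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Edge:
    subject_id: int
    predicate_id: str
    object_id: int
    valid_start: date
    valid_end: date


def filter_edges(
    edges: Iterable[Edge],
    concept_id: int,
    direction: str,
    predicate_ids: Optional[FrozenSet[str]] = None,
    on: Optional[date] = None,
) -> List[Edge]:
    """Keep edges touching concept_id in the given direction and passing the filters."""
    if direction not in ("out", "in"):
        raise ValueError("direction must be 'out' or 'in'")
    kept = []
    for edge in edges:
        # 'out' anchors on the subject side, 'in' on the object side.
        anchor = edge.subject_id if direction == "out" else edge.object_id
        if anchor != concept_id:
            continue
        if predicate_ids is not None and edge.predicate_id not in predicate_ids:
            continue
        if on is not None and not (edge.valid_start <= on <= edge.valid_end):
            continue
        kept.append(edge)
    return kept


# Invented sample edges for the sketch.
EDGES = [
    Edge(1, "Is a", 2, date(2000, 1, 1), date(2099, 12, 31)),
    Edge(1, "Maps to", 3, date(2000, 1, 1), date(2010, 1, 1)),
    Edge(4, "Is a", 1, date(2000, 1, 1), date(2099, 12, 31)),
]
```

The `predicate_kinds` and `within_domain` filters are left out here because they depend on metadata this page does not describe.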

get_all_concept_domain_ids()

Retrieve all distinct Domain IDs present in the concept table.

get_all_concept_vocabulary_ids()

Retrieve all distinct Vocabulary IDs present in the concept table.

get_num_ancestors(concept_ids)

Get the count of ancestors for a batch of concepts.

get_potential_ancestor(child_id, parent_id)

Check if an ancestry relationship exists between a child and parent.
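Both ancestor helpers can be illustrated on a toy copy of the OMOP concept_ancestor table. The sketch assumes that self-referencing rows (0 levels of separation) are excluded from counts and checks, and that the check returns a boolean; neither is stated on this page. Function names and rows are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple

# (ancestor_concept_id, descendant_concept_id, min_levels_of_separation)
# Invented hierarchy 1 -> 2 -> 3, including the usual self rows at level 0.
ANCESTOR_ROWS = [
    (1, 1, 0), (2, 2, 0), (3, 3, 0),
    (1, 2, 1), (2, 3, 1), (1, 3, 2),
]


def num_ancestors(concept_ids: Tuple[int, ...]) -> Dict[int, int]:
    """Count proper ancestors (level > 0) for each requested concept."""
    counts = Counter(desc for _, desc, level in ANCESTOR_ROWS if level > 0)
    return {cid: counts.get(cid, 0) for cid in concept_ids}


def has_ancestor(child_id: int, parent_id: int) -> bool:
    """True if parent_id is a proper ancestor of child_id."""
    return any(
        anc == parent_id and desc == child_id and level > 0
        for anc, desc, level in ANCESTOR_ROWS
    )
```

Counting for the whole batch in one pass avoids one query per concept, which is the usual motivation for a batch method.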

leaves(domain_id=None, vocabulary_id=None) cached

Retrieve leaf concepts (no children).

parents(concept_id) cached

Retrieve the parent concept IDs of a concept using the Concept_Ancestor table.

predicate(relationship_id) cached

Retrieve a Predicate object by its relationship ID.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| relationship_id | str | The OMOP relationship ID (e.g., 'Maps to'). | required |

Returns:

| Type | Description |
|---|---|
| Predicate | The predicate definition. |

predicate_kind(relationship_id)

Classify the predicate into a semantic kind.

predicate_kinds(relationship_ids)

Classify a batch of predicates.

predicate_name(relationship_id) cached

Retrieve the human-readable name of a relationship.

predicates()

Return all predicates known to the knowledge graph.

relationships(session, subjects, predicates, objects, invert=False)

Query relationships between concepts.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| subjects | list[CURIE] \| None | List of subject CURIEs. | required |
| predicates | list[str] \| None | List of predicate (relationship) IDs. | required |
| objects | list[CURIE] \| None | List of object CURIEs. | required |
| invert | bool | If True, swaps subjects and objects in the query and result. | False |

Yields:

| Type | Description |
|---|---|
| Tuple[CURIE, PRED_CURIE, CURIE] | Triples (subject, predicate, object). |
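One reading of the `invert` flag is sketched below as a generator over an in-memory triple list. This is an interpretation, not the library's implementation: the sketch takes a `triples` iterable where the real method takes a session, treats None as "no filter", and uses invented CURIE strings such as 'omop:1'.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]


def relationships(
    triples: Iterable[Triple],
    subjects: Optional[List[str]],
    predicates: Optional[List[str]],
    objects: Optional[List[str]],
    invert: bool = False,
) -> Iterator[Triple]:
    """Yield matching (subject, predicate, object) triples; None means no filter."""
    if invert:
        # Swap the roles for the query ...
        subjects, objects = objects, subjects
    for subj, pred, obj in triples:
        if subjects is not None and subj not in subjects:
            continue
        if predicates is not None and pred not in predicates:
            continue
        if objects is not None and obj not in objects:
            continue
        # ... and swap them back in the result.
        yield (obj, pred, subj) if invert else (subj, pred, obj)


# Invented sample data for the sketch.
TRIPLES = [("omop:1", "Is a", "omop:2"), ("omop:2", "Is a", "omop:3")]
```

With `invert=True`, a caller can ask "what points at omop:2?" while still passing omop:2 in the `subjects` position.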

reverse_predicate_id(relationship_id)

Get the reverse relationship ID, if it exists.

rollback_session()

Safely roll back the session if it is in a pending state.

roots(domain_id=None, vocabulary_id=None) cached

Retrieve root concepts (no parents).

singletons(domain_id=None, vocabulary_id=None) cached

Retrieve singleton concepts (no parents and no children).
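The three definitions (roots, leaves, singletons) can be checked on a small parent/child edge list. This sketch applies the definitions literally, so a singleton also appears among the roots and the leaves; whether the real methods exclude singletons from those sets is not stated here. The `classify` helper is hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def classify(
    nodes: Iterable[int], parent_child: Iterable[Tuple[int, int]]
) -> Dict[str, Set[int]]:
    """Split nodes into roots (no parents), leaves (no children), singletons (neither)."""
    pairs = list(parent_child)
    has_child = {parent for parent, _ in pairs}
    has_parent = {child for _, child in pairs}
    node_set = set(nodes)
    roots = node_set - has_parent
    leaves = node_set - has_child
    return {"roots": roots, "leaves": leaves, "singletons": roots & leaves}
```

For the chain 1 -> 2 -> 3 plus an isolated node 4, node 4 is the only singleton.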

specificity(concept_id)

Compute specificity as the inverse of out-degree. Higher is more specific.
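The inverse-of-out-degree measure can be worked through on a toy adjacency map. Which edges count toward out-degree, and what happens when the out-degree is zero, are not specified on this page; the sketch counts all listed outgoing edges and returns infinity for zero, both as assumptions.

```python
import math
from typing import Dict, List


def specificity(adjacency: Dict[int, List[int]], concept_id: int) -> float:
    """Return 1 / out-degree; concepts with no outgoing edges get infinity."""
    out_degree = len(adjacency.get(concept_id, []))
    if out_degree == 0:
        return math.inf  # assumption: nothing below it, so maximally specific
    return 1.0 / out_degree


# Invented sample graph: concept 1 fans out widely, concept 3 not at all.
ADJACENCY = {1: [2, 3, 4, 5], 2: [3], 3: []}
```

A concept with four outgoing edges scores 0.25, while one with a single outgoing edge scores 1.0, matching "higher is more specific".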

synonyms_for_concept(concept_id) cached

Retrieve all synonyms for a concept.

KnowledgeGraphEmbeddingConfiguration dataclass

Configuration for embedding-based operations in the knowledge graph.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| backend_type | EmbeddingBackendType | The embedding backend name (e.g., 'faiss', 'pinecone') or type to use. | None |
| base_storage_dir | str | The directory where embeddings are stored. | None |
| client | EmbeddingClient | An optional client instance for generating embeddings. If not provided, no write operations can take place. | None |
| provider_type | EmbeddingProviderType | The provider name (e.g., 'openai', 'ollama') or type, for use with the read-only embedding reader interface. | None |
| canonical_model_name | str | The canonical model name to use for the embedding reader interface (e.g., 'text-embedding-3-small:0.6b'). Required by the read-only embedding interface to determine which embeddings to retrieve for concepts. It is obtained from the client if one is provided; otherwise it must be set explicitly for read-only use. | None |
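The rule for the model name (taken from the client when present, otherwise required explicitly) can be sketched with a stand-in dataclass. `EmbeddingConfigSketch`, `resolved_model_name`, `FakeClient`, and the client attribute `canonical_model_name` are all hypothetical; only the field names mirror the table above, and plain strings replace the real enum types.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class EmbeddingConfigSketch:
    """Hypothetical stand-in mirroring the documented fields."""

    backend_type: Optional[str] = None
    base_storage_dir: Optional[str] = None
    client: Optional[Any] = None
    provider_type: Optional[str] = None
    canonical_model_name: Optional[str] = None

    @property
    def read_only(self) -> bool:
        # Without a client, only read operations are possible.
        return self.client is None

    def resolved_model_name(self) -> str:
        """Prefer the client's model name; otherwise require the explicit one."""
        if self.client is not None:
            return self.client.canonical_model_name  # hypothetical attribute
        if self.canonical_model_name is None:
            raise ValueError("canonical_model_name is required for read-only use")
        return self.canonical_model_name


class FakeClient:
    """Illustrative client exposing only a model name."""

    canonical_model_name = "model-from-client"
```

Failing early on a missing model name keeps a read-only configuration from silently retrieving no embeddings.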