
Scoring

omop_graph.graph.scoring

Scoring algorithms for ranking resolved concepts.

This module implements the logic for scoring candidate OMOP concepts based on:

1. Relevance: how well the concept text matches the query (embedding similarity combined with string similarity).
2. Parsimony: a penalty for deep graph traversals, that is, for concepts found far away in the graph.
3. Broadness: a reward for concepts that are more general (higher ancestor count), often useful for finding category headers.

StandardConceptWithScore dataclass

Bases: StandardConcept

A StandardConcept enriched with scoring metrics.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| total_score | float | The final calculated score used for ranking. Formula: relevance - parsimony_penalty + broadness_bonus |
| embedding_score | float, optional | The cosine similarity score from the embedding model. |
| relevance | float | The composite relevance score (embedding * textual similarity). |
| parsimony_penalty | float | Penalty based on graph distance (separation). |
| broadness_bonus | float | Bonus based on the concept's generality (ancestor count). |
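
As a worked illustration of the total_score formula above, the following sketch combines the three components. The `combine_score` helper is hypothetical and not part of `omop_graph`, and all numeric values are invented for the example.

```python
def combine_score(relevance: float, parsimony_penalty: float, broadness_bonus: float) -> float:
    """Hypothetical helper mirroring the documented formula:
    total_score = relevance - parsimony_penalty + broadness_bonus
    """
    return relevance - parsimony_penalty + broadness_bonus


# Relevance as the product of the embedding score and a textual similarity,
# following the description of the `relevance` attribute (illustrative values).
embedding_score = 0.8
textual_similarity = 0.5
relevance = embedding_score * textual_similarity

# A distant concept pays a penalty; a general concept earns a small bonus.
total = combine_score(relevance, parsimony_penalty=0.1, broadness_bonus=0.05)
print(f"total_score is approximately {total:.2f}")
```

With these inputs the relevance is about 0.4, so the total comes out at roughly 0.35; how the library actually derives the penalty and the bonus is not specified on this page.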

from_standard_concept(standard_concept, embedding_score, relevance, parsimony_penalty, broadness_bonus, total_score) classmethod

Factory method to promote a StandardConcept to a scored version.
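
The promotion pattern can be sketched with plain dataclasses. This is a minimal illustration, not the library's implementation: `Concept` and `ConceptWithScore` are hypothetical stand-ins for StandardConcept and StandardConceptWithScore, and the fields `concept_id` and `concept_name` are assumed for the example.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """Hypothetical stand-in for StandardConcept (field names are assumed)."""
    concept_id: int
    concept_name: str


@dataclass(frozen=True)
class ConceptWithScore(Concept):
    """Hypothetical stand-in for StandardConceptWithScore."""
    total_score: float = 0.0
    embedding_score: Optional[float] = None
    relevance: float = 0.0
    parsimony_penalty: float = 0.0
    broadness_bonus: float = 0.0

    @classmethod
    def from_concept(cls, concept, embedding_score, relevance,
                     parsimony_penalty, broadness_bonus, total_score):
        # Copy only the base-class fields, then attach the scoring metrics.
        base = {f.name: getattr(concept, f.name) for f in fields(Concept)}
        return cls(
            **base,
            total_score=total_score,
            embedding_score=embedding_score,
            relevance=relevance,
            parsimony_penalty=parsimony_penalty,
            broadness_bonus=broadness_bonus,
        )


plain = Concept(concept_id=101, concept_name="Example concept")
scored = ConceptWithScore.from_concept(plain, 0.8, 0.4, 0.1, 0.05, 0.35)
```

Because the scored class subclasses the plain one, a promoted instance can still be passed anywhere a plain concept is expected.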

score_standard_concepts(text, standard_concepts, kg, similarity_scores_with_concept_ids=None)

Rank a list of standard concepts against a query text.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| text | str | The original query text. | required |
| standard_concepts | tuple[StandardConcept, ...] | The tuple of candidate concepts to score. | required |
| kg | KnowledgeGraph | The graph instance used for retrieving metadata (like ancestor counts). | required |
| similarity_scores_with_concept_ids | Tuple[Mapping[int, float], ...] | Pre-computed embedding similarity scores. The outer tuple corresponds to the query vectors in order, and each inner dictionary maps concept IDs to their similarity scores with the query embedding. | None |
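
To make the shape of similarity_scores_with_concept_ids concrete, here is a small sketch of the structure and one way to read from it. The `best_embedding_score` helper is hypothetical, the concept IDs and scores are invented, and taking the maximum across query vectors is an assumption; the library may aggregate differently.

```python
from typing import Mapping, Optional, Tuple

# One mapping per query vector, in order; each maps concept ID -> similarity.
SimilarityScores = Tuple[Mapping[int, float], ...]


def best_embedding_score(
    concept_id: int, scores: Optional[SimilarityScores]
) -> Optional[float]:
    """Hypothetical helper: highest similarity for a concept across all
    query vectors, or None when no scores are available for it."""
    if not scores:
        return None
    found = [per_query[concept_id] for per_query in scores if concept_id in per_query]
    return max(found) if found else None


# Two query vectors; concept 101 appears in both mappings (invented values).
similarity_scores: SimilarityScores = (
    {101: 0.82, 202: 0.40},
    {101: 0.75, 303: 0.61},
)

print(best_embedding_score(101, similarity_scores))
print(best_embedding_score(999, similarity_scores))
```

Returning None for a missing concept matches the optional typing of embedding_score on StandardConceptWithScore.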

Returns:

| Type | Description |
| --- | --- |
| list[StandardConceptWithScore] | The list of concepts with scores attached. |
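
This page does not state whether the returned list is already ordered, so a caller who needs a strict ranking may want to sort by total_score explicitly. The sketch below uses a hypothetical `ScoredConcept` stand-in rather than the real StandardConceptWithScore, with invented IDs and scores.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ScoredConcept:
    """Hypothetical stand-in exposing only the fields needed for ranking."""
    concept_id: int
    total_score: float


def rank_by_total_score(scored: Iterable[ScoredConcept]) -> List[ScoredConcept]:
    """Return the concepts ordered from highest to lowest total_score."""
    return sorted(scored, key=lambda c: c.total_score, reverse=True)


results = [
    ScoredConcept(concept_id=101, total_score=0.35),
    ScoredConcept(concept_id=202, total_score=0.62),
    ScoredConcept(concept_id=303, total_score=0.10),
]

ranked = rank_by_total_score(results)
top_ids = [c.concept_id for c in ranked]
```

Python's `sorted` is stable, so concepts with equal scores keep their original relative order.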