Scoring
omop_graph.graph.scoring
Scoring algorithms for ranking resolved concepts.
This module implements the logic for scoring candidate OMOP concepts based on:

1. Relevance: how well the text matches the query (embeddings + string similarity).
2. Parsimony: penalizing deep graph traversals (finding a concept far away).
3. Broadness: rewarding concepts that are more general (higher ancestor count), often useful for finding category headers.
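The three components can be sketched as small functions. This is a minimal illustration, not the module's implementation: `difflib.SequenceMatcher` stands in for whatever string similarity the module uses, the function names are hypothetical, and the linear shapes and `weight` values for the penalty and bonus are assumptions. How the components combine into a final score is deliberately left out.

```python
from difflib import SequenceMatcher


def textual_similarity(query: str, label: str) -> float:
    """String similarity in [0, 1]; difflib is a stand-in for the real matcher."""
    return SequenceMatcher(None, query.lower(), label.lower()).ratio()


def relevance(embedding_score: float, query: str, label: str) -> float:
    """Composite relevance: embedding similarity times textual similarity."""
    return embedding_score * textual_similarity(query, label)


def parsimony_penalty(separation: int, weight: float = 0.1) -> float:
    """Hypothetical penalty that grows linearly with graph distance."""
    return weight * separation


def broadness_bonus(ancestor_count: int, weight: float = 0.01) -> float:
    """Hypothetical bonus that grows linearly with the ancestor count."""
    return weight * ancestor_count
```

With these shapes, a concept reached after more hops always receives a larger penalty, and a concept with more ancestors always receives a larger bonus.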
StandardConceptWithScore
dataclass
Bases: StandardConcept
A StandardConcept enriched with scoring metrics.
Attributes:
| Name | Type | Description |
|---|---|---|
| total_score | float | The final calculated score used for ranking. Formula: |
| embedding_score | (float, optional) | The cosine similarity score from the embedding model. |
| relevance | float | The composite relevance score (embedding * textual similarity). |
| parsimony_penalty | float | Penalty based on graph distance (separation). |
| broadness_bonus | float | Bonus based on the concept's generality (ancestor count). |
from_standard_concept(standard_concept, embedding_score, relevance, parsimony_penalty, broadness_bonus, total_score)
classmethod
Factory method to promote a StandardConcept to a scored version.
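The promotion pattern can be sketched with stand-in dataclasses. The classes below are simplified substitutes that reuse the documented names: the base fields `concept_id` and `concept_name` are hypothetical, the use of `frozen=True` is an assumption, and the real classes may differ.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class StandardConcept:
    # Hypothetical stand-in; the real class defines its own fields.
    concept_id: int
    concept_name: str


@dataclass(frozen=True)
class StandardConceptWithScore(StandardConcept):
    total_score: float
    relevance: float
    parsimony_penalty: float
    broadness_bonus: float
    embedding_score: Optional[float] = None

    @classmethod
    def from_standard_concept(cls, standard_concept, embedding_score, relevance,
                              parsimony_penalty, broadness_bonus, total_score):
        # Copy every field of the base concept, then attach the scores.
        base = {f.name: getattr(standard_concept, f.name)
                for f in fields(standard_concept)}
        return cls(**base, embedding_score=embedding_score, relevance=relevance,
                   parsimony_penalty=parsimony_penalty,
                   broadness_bonus=broadness_bonus, total_score=total_score)


# Placeholder values chosen only for illustration.
concept = StandardConcept(concept_id=1, concept_name="Example concept")
scored = StandardConceptWithScore.from_standard_concept(
    concept, embedding_score=0.9, relevance=0.8,
    parsimony_penalty=0.1, broadness_bonus=0.05, total_score=0.75,
)
```

Copying fields through `dataclasses.fields` keeps the factory working if the base class gains new fields.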
score_standard_concepts(text, standard_concepts, kg, similarity_scores_with_concept_ids=None)
Rank a list of standard concepts against a query text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | The original query text. | required |
| standard_concepts | tuple[StandardConcept, ...] | The tuple of candidate concepts to score. | required |
| kg | KnowledgeGraph | The graph instance used for retrieving metadata (like ancestor counts). | required |
| similarity_scores_with_concept_ids | Tuple[Mapping[int, float], ...] | Pre-computed embedding similarity scores. The outer tuple corresponds to the query vectors in order, and each inner dictionary maps concept IDs to their similarity scores with the query embedding. | None |
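The shape of `similarity_scores_with_concept_ids` can be illustrated as follows. The concept IDs and scores are placeholder values, and `lookup` is a hypothetical helper, not part of the module.

```python
from typing import Mapping, Optional, Tuple

# One mapping per query vector, in query order; keys are concept IDs.
similarity_scores_with_concept_ids: Tuple[Mapping[int, float], ...] = (
    {101: 0.91, 102: 0.40},  # scores against the first query vector
    {101: 0.75, 103: 0.62},  # scores against the second query vector
)


def lookup(scores: Tuple[Mapping[int, float], ...],
           query_index: int, concept_id: int) -> Optional[float]:
    """Return the score of one concept for one query vector, or None if absent."""
    return scores[query_index].get(concept_id)
```

A concept may appear in some mappings and not in others, so a lookup should tolerate missing keys.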
Returns:
| Type | Description |
|---|---|
| list[StandardConceptWithScore] | The list of concepts with scores attached. |
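This reference does not state whether the returned list is already ordered, so a caller that needs the best candidates may want to sort explicitly by `total_score`. The sketch below uses a hypothetical `ScoredStub` class in place of `StandardConceptWithScore`, and `top_k` is an illustrative helper, not part of the module.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoredStub:
    # Hypothetical stand-in exposing only the fields used for ranking.
    concept_id: int
    total_score: float


def top_k(scored: List[ScoredStub], k: int = 3) -> List[ScoredStub]:
    """Return the k highest-scoring concepts, best first."""
    return sorted(scored, key=lambda c: c.total_score, reverse=True)[:k]


# Placeholder IDs and scores chosen only for illustration.
results = [ScoredStub(101, 0.42), ScoredStub(102, 0.87), ScoredStub(103, 0.65)]
best = top_k(results, k=2)
```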