Services — API Reference

Application Composition

build_application(config) is the shared composition root for direct Python consumers. It builds the adapters and the service container, then returns a GroundworkersApp with both attached.

groundworkers.app

VocabService

VocabService provides vocabulary search and concept navigation over OMOP CDM vocabulary tables. It is the backing service for the search MCP tools and a dependency of MappingService.

groundworkers.services.vocab

ConceptMatch dataclass

A single candidate returned by search_exact, search_normalized, or search_fulltext.

MappedConcept dataclass

A standard concept that a source concept maps to.

RelatedConceptMapping dataclass

Relationship-driven mapping result for a single source concept_id.

StandardMapping dataclass

Navigation result for a single source concept_id.

VocabService

Direct Python API for OMOP vocabulary search and concept navigation.

Exposes raw quality signals (ts_rank, standard_concept flag) so callers can apply their own quality thresholds and decide whether to navigate non-standard results to their standard equivalents.

Raises GroundworkersError for database or query errors. Raises ValueError for invalid arguments.

fts_available property

True when the concept_name_tsvector sidecar column is present.

navigate_to_standard(concept_ids)

Return standard equivalents for a list of concept_ids via "Maps to" relationship edges.

For a concept_id that is already standard, standard_concepts contains the concept itself. For a concept_id with no outbound "Maps to" relationship, standard_concepts is empty ([]). concept_ids not found in the vocabulary are silently omitted from the result.
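The three cases above can be illustrated with a small in-memory sketch. It does not call the real service: the CONCEPTS and MAPS_TO tables, the StandardMappingSketch class, and its field names are hypothetical stand-ins chosen only to mirror the documented behaviour.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory vocabulary: concept_id -> is_standard flag.
CONCEPTS = {1: True, 2: False, 3: False}
# Hypothetical "Maps to" edges: source concept_id -> standard concept_ids.
MAPS_TO = {2: [1]}


@dataclass
class StandardMappingSketch:
    """Stand-in for StandardMapping; the field names are assumptions."""
    concept_id: int
    standard_concepts: list = field(default_factory=list)


def navigate_to_standard_sketch(concept_ids):
    results = []
    for cid in concept_ids:
        if cid not in CONCEPTS:
            continue  # unknown ids are silently omitted
        if CONCEPTS[cid]:
            targets = [cid]  # already standard: maps to itself
        else:
            targets = list(MAPS_TO.get(cid, []))  # empty when no edge exists
        results.append(StandardMappingSketch(cid, targets))
    return results


mappings = navigate_to_standard_sketch([1, 2, 3, 999])
```

Here concept 1 maps to itself, concept 2 maps to concept 1, concept 3 comes back with an empty list, and 999 produces no entry at all, so callers should not assume one result per input id.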

navigate_to_unit(concept_ids)

Return "Maps to unit" related concepts for the given concept_ids.

navigate_to_value(concept_ids)

Return "Maps to value" related concepts for the given concept_ids.

search_exact(query, *, domain=None, vocabulary_id=None, standard_only=False, active_only=False, include_synonyms=True, parent_ids=None, limit=20)

Case-insensitive exact match against concept_name and optionally concept_synonym_name.

standard_only defaults to False so the caller can inspect non-standard candidates and decide whether to navigate to their standard equivalents.

Returns name matches before synonym matches; deduplicates by concept_id so a concept that matches both name and synonym only appears once.
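The ordering and deduplication rule can be sketched as follows. This is an illustration of the documented behaviour, not the service's implementation; the helper name and the dict shape (including the matched_on key) are assumptions made for the example.

```python
def merge_name_and_synonym_hits(name_hits, synonym_hits, limit=20):
    """Name matches first, then synonym matches, one entry per concept_id."""
    seen = set()
    merged = []
    for hit in [*name_hits, *synonym_hits]:
        cid = hit["concept_id"]
        if cid in seen:
            continue  # already returned via an earlier (name) match
        seen.add(cid)
        merged.append(hit)
    return merged[:limit]


# Hypothetical hits: concept 10 matches on both its name and a synonym.
name_hits = [{"concept_id": 10, "matched_on": "name"}]
synonym_hits = [
    {"concept_id": 10, "matched_on": "synonym"},
    {"concept_id": 11, "matched_on": "synonym"},
]
merged = merge_name_and_synonym_hits(name_hits, synonym_hits)
```

Concept 10 appears once, attributed to its name match, followed by concept 11 from the synonym matches.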

search_fulltext(query, *, domain=None, vocabulary_id=None, standard_only=False, active_only=False, include_synonyms=True, parent_ids=None, min_rank=0.0, limit=20)

PostgreSQL full-text search (FTS) match against the GIN-indexed tsvector sidecar column.

Returns (results, fts_available). When fts_available is False the sidecar column was not detected and results is always []; the caller should fall through to another search strategy.

ts_rank is included in each result so the caller can apply its own quality threshold. Synonym FTS is included when the synonym sidecar column is also present; otherwise synonym results are silently omitted.
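One way a caller might combine the (results, fts_available) return value with its own ts_rank threshold is sketched below. StubVocabService is a hypothetical stand-in that mimics only the documented return shapes; results are plain dicts here rather than ConceptMatch instances, and the 0.1 threshold is an arbitrary illustrative value.

```python
class StubVocabService:
    """Hypothetical stand-in mimicking only the documented return shapes."""

    def __init__(self, fts_available):
        self._fts_available = fts_available

    def search_fulltext(self, query, *, min_rank=0.0, limit=20):
        if not self._fts_available:
            return [], False  # sidecar column missing: always empty
        hits = [
            {"concept_name": "Type 2 diabetes mellitus", "ts_rank": 0.61},
            {"concept_name": "Diabetes insipidus", "ts_rank": 0.08},
        ]
        return [h for h in hits if h["ts_rank"] >= min_rank][:limit], True

    def search_exact(self, query, *, limit=20):
        return [{"concept_name": query, "ts_rank": None}][:limit]


def search_with_fallback(vocab, query, rank_threshold=0.1):
    results, fts_available = vocab.search_fulltext(query)
    if not fts_available:
        # Fall through to another strategy, as the docstring recommends.
        return vocab.search_exact(query)
    # Caller-side quality threshold on the raw ts_rank signal.
    return [r for r in results if r["ts_rank"] >= rank_threshold]


with_fts = search_with_fallback(StubVocabService(True), "diabetes")
without_fts = search_with_fallback(StubVocabService(False), "diabetes")
```

Checking fts_available rather than an empty result list distinguishes "FTS is not installed" from "FTS ran and found nothing", which call for different follow-up behaviour.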

search_normalized(query, *, domain=None, vocabulary_id=None, standard_only=False, active_only=False, include_synonyms=False, normalization_profile='verbatim', parent_ids=None, remove_stop_phrases=True, limit=20)

Deterministic near-verbatim search after text normalization.

Both the query and the candidate text are normalized before comparison. Unlike full-text search, this is deterministic equality, not ranked retrieval.
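The idea of comparing normalized forms for equality can be sketched as below. The normalization steps shown (case folding, punctuation to spaces, whitespace collapsing) are assumptions for illustration only; they are not a description of the 'verbatim' profile or of stop-phrase removal, whose exact rules are not given in this reference.

```python
import re


def normalize_sketch(text):
    """Illustrative normalization only; not the library's actual profile."""
    text = text.casefold()
    text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation with spaces
    return " ".join(text.split())         # collapse runs of whitespace


def normalized_equal(query, candidate):
    # Deterministic: both sides pass through the same function, then ==.
    return normalize_sketch(query) == normalize_sketch(candidate)


same = normalized_equal("Type-2  Diabetes", "type 2 diabetes")
different = normalized_equal("diabetes", "diabetes mellitus")
```

Because the comparison is plain equality, a candidate either matches or it does not; there is no score to threshold, in contrast to ts_rank from full-text search.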

normalize_text_for_matching(text, *, profile='verbatim', remove_stop_phrases=True)

Normalize free text into a deterministic matching form.

serialise_concept_match(match)

Serialise a ConceptMatch to a JSON-safe dict for MCP tool responses.

Serialise a RelatedConceptMapping to a JSON-safe dict for MCP tool responses.

serialise_standard_mapping(mapping)

Serialise a StandardMapping to a JSON-safe dict for MCP tool responses.

MappingService

MappingService is the direct-Python API for mapping workflows. The mapping MCP tools delegate to this service rather than implementing orchestration in the tool module.

groundworkers.services.mapping

MappingService

Direct Python API for mapping-oriented vocabulary workflows.

TextService

TextService provides LLM-backed clinical text preprocessing. The text MCP tools (text_normalize, text_decompose, text_disambiguate) delegate to this service.

groundworkers.services.text

LLM-backed clinical text semantics.

This package keeps the public TextService surface small and stable while separating result models, prompt definitions, and service orchestration into modules that can grow independently.

DecomposeResult

Bases: BaseModel

Result of decomposing free text into a list of clinical search terms.

DecomposeTerm

Bases: BaseModel

One extracted clinical concept from a decomposition.

DisambiguateResult

Bases: BaseModel

Result of listing all plausible interpretations of an ambiguous term.

Interpretation

Bases: BaseModel

One candidate interpretation of an ambiguous term.

MappingCleanupResult

Bases: BaseModel

Result of rewriting source text into a more mappable search phrase.

NormalizeResult

Bases: BaseModel

Result of a single-term normalization.

TextService

Direct Python API for LLM-backed clinical text preprocessing.

TextService interprets caller-provided clinical phrases. It normalizes single terms, decomposes multi-concept free text, and surfaces ranked interpretations when the input is ambiguous. The outputs are typed and ready to feed into downstream concept grounding workflows.

All methods raise ValueError for invalid input and GroundworkersError for LLM backend failures or malformed structured responses.

decompose(text, *, domain_hint=None, max_terms=10, model_name=None)

Decompose a free-text clinical description into normalized search terms.

disambiguate(text, *, domain_hint=None, max_interpretations=5, model_name=None)

List all plausible clinical interpretations of an ambiguous term.

mapping_cleanup(text, *, context=None, domain_hint=None, model_name=None)

Rewrite source text into a more mappable OMOP search phrase.

normalize(text, *, domain_hint=None, model_name=None)

Normalize a clinical term, abbreviation, lay phrase, or misspelling.

build_user_prompt(operation, text, **kwargs)

Construct the user-turn prompt for a text operation.

Shared between TextService methods and MCP prompt handlers so both present the same request surface to the LLM.

Clamping is applied here so prompt handlers and service calls always show the same bounded values.
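A minimal sketch of why clamping in a single shared builder keeps both callers consistent. The bounds, the prompt wording, and the function name build_user_prompt_sketch are hypothetical; the real limits and prompt text are not stated in this reference.

```python
# Hypothetical bounds; the real limits are not documented here.
MAX_TERMS_BOUNDS = (1, 25)


def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def build_user_prompt_sketch(operation, text, **kwargs):
    if operation == "decompose":
        # Clamp once, in the shared builder, so every caller sees the
        # same bounded value in the rendered prompt.
        max_terms = clamp(kwargs.get("max_terms", 10), *MAX_TERMS_BOUNDS)
        return f"Decompose into at most {max_terms} terms: {text}"
    return f"{operation}: {text}"


prompt = build_user_prompt_sketch("decompose", "chest pain and SOB", max_terms=500)
```

With the assumed bounds, an out-of-range max_terms=500 is rendered as 25 regardless of whether the request arrived through a service method or an MCP prompt handler.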

DomainService

DomainService provides LLM-backed batch OMOP domain classification for structured field labels and example values. The domain_classify MCP tool delegates to this service.

groundworkers.services.domain

DomainService — LLM-assisted OMOP domain classification for data dictionary attributes.

Accepts a batch of field labels and their example response values, and returns a mapping of label → OMOP domain string for any label that can be confidently classified. Labels that yield null or an unrecognised domain are omitted so that callers fall through to the next resolution tier (keyword heuristics).

Valid domains returned: Measurement, Condition, Observation, Procedure, Drug, Device, Metadata, Identifier.

"Metadata" and "Identifier" are groundcrew-internal sentinel values, not OMOP CDM domains. The ingester treats them as skip signals and does not create SourceItems for those rows.

DomainService

Classify data dictionary field labels into OMOP CDM domains via the LLM adapter.

classify_attributes(label_values, model_name=None)

Classify field labels into OMOP domains.

label_values maps each field label text to a (possibly empty) list of example response-value strings. Returns a dict containing only labels that received a valid domain string — labels mapped to null or an unrecognised value are excluded so callers fall through to the next tier.
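The filtering rule described above can be sketched as follows. The domain list is taken from this page; the helper name and the sample raw response are hypothetical, and the sketch stands in for whatever parsing the service performs on the LLM output.

```python
# Valid domain strings as listed in this reference.
VALID_DOMAINS = {
    "Measurement", "Condition", "Observation", "Procedure",
    "Drug", "Device", "Metadata", "Identifier",
}


def keep_valid_domains(raw):
    """Drop labels whose value is null or not a recognised domain."""
    return {
        label: domain
        for label, domain in raw.items()
        if domain in VALID_DOMAINS
    }


# Hypothetical parsed LLM response for four field labels.
raw_response = {
    "Systolic BP": "Measurement",
    "Free text notes": None,   # null: omitted
    "Site": "Location",        # unrecognised: omitted
    "Record ID": "Identifier",
}
classified = keep_valid_domains(raw_response)
```

A caller can then treat any label missing from the returned dict as unresolved and pass it to the next resolution tier.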

Raises BACKEND_UNAVAIL when the LLM cannot be reached. Raises QUERY_ERROR when the response is not valid JSON.