
API Reference

This reference is automatically generated from the source code.

omop_llm

interface

client

LLMClient dataclass

Base class for LLM clients.

This class replicates the API of the OntoGPT LLMClient but serves as a base for other implementations (e.g., InstructorClient). Relies on the OpenAI client for core functionality.

Parameters:

model : str
    The name of the model to use (e.g., 'gpt-4', 'llama3'). Required.
api_base : str
    The base URL for the API endpoint. Required.
api_key : str, default 'ollama'
    The API key for authentication.
temperature : float, default 1.0
    The temperature parameter for generation.
system_message : str, default ''
    The default system message to prepend to chats.

Attributes:

_base_client : OpenAI
    The initialized OpenAI client instance.
_embedding_dim : int or None
    Cached embedding dimension size.

embedding_dim property

Retrieve the embedding dimension for the current model.

If the dimension is not cached, it attempts to fetch it from the API. Currently supports Ollama endpoints.

Returns:

int
    The size of the embedding vector.

Raises:

ValueError
    If model information cannot be found in the Ollama response.
NotImplementedError
    If the API base is not supported for automatic dimension retrieval.

cosine_similarity(vecs_a, vecs_b) staticmethod

Compute the cosine similarity between two matrices of vectors.

Parameters:

vecs_a : ndarray
    A 2D array of vectors (shape: M x D). Required.
vecs_b : ndarray
    A 2D array of vectors (shape: N x D). Required.

Returns:

ndarray
    The dot product of the normalized vectors (shape: M x N).

Notes

A small epsilon (1e-10) is added to the norms to prevent division by zero.
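The computation described above can be sketched in NumPy. This is a reimplementation based on the description (normalized dot product with an epsilon guard), not the library's source:

```python
import numpy as np

def cosine_similarity(vecs_a: np.ndarray, vecs_b: np.ndarray) -> np.ndarray:
    """Normalized dot product between rows of vecs_a (M x D) and vecs_b (N x D)."""
    eps = 1e-10  # guards against division by zero for all-zero vectors
    norms_a = np.linalg.norm(vecs_a, axis=1, keepdims=True) + eps  # shape M x 1
    norms_b = np.linalg.norm(vecs_b, axis=1, keepdims=True) + eps  # shape N x 1
    return (vecs_a / norms_a) @ (vecs_b / norms_b).T  # shape M x N

a = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([[1.0, 0.0]])
sim = cosine_similarity(a, b)
print(sim)  # row i, column j holds the similarity of a[i] to b[j]
```

Because the inputs are normalized row-wise, each entry of the result lies in [-1, 1].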

embeddings(text, batch_size=None)

Retrieve embeddings for the given text.

Parameters:

text : str or List[str]
    The input text or list of texts to embed. Required.
batch_size : int, default None
    The number of texts to process in a single API call. If None, 32 is used.

Returns:

ndarray
    A 2D numpy array containing the embeddings.

Raises:

AssertionError
    If the base client has not been initialized.
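The batching behaviour can be illustrated with a plain-Python sketch. `batched` here is a hypothetical helper, not part of the library; the real method sends each chunk to the embeddings API:

```python
def batched(texts, batch_size=None):
    """Split input into consecutive chunks of at most batch_size texts."""
    if isinstance(texts, str):
        texts = [texts]  # a single string is treated as one text
    batch_size = batch_size or 32  # None falls back to the default of 32
    return [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

# 70 texts with the default batch size yield three chunks: 32 + 32 + 6
batches = batched([f"text {i}" for i in range(70)])
print([len(b) for b in batches])  # [32, 32, 6]
```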

euclidean_distance(text1, text2, **kwargs)

Calculate the Euclidean distance between embeddings of two texts.

Parameters:

text1 : str
    The first text string. Required.
text2 : str
    The second text string. Required.
**kwargs : Any
    Additional arguments passed to the embedding function.

Returns:

float
    The Euclidean distance (L2 norm) between the two embedding vectors.
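Once the two texts are embedded, the returned value is a plain L2 norm of the difference. A minimal NumPy sketch of that final step, using made-up vectors in place of real embeddings:

```python
import numpy as np

# Hypothetical vectors standing in for embeddings(text1) and embeddings(text2)
emb1 = np.array([1.0, 2.0, 2.0])
emb2 = np.array([1.0, 0.0, 0.0])

distance = float(np.linalg.norm(emb1 - emb2))  # L2 norm of the difference
print(distance)  # sqrt(0 + 4 + 4) = sqrt(8)
```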

similarity(terms, terms_to_match, **kwargs)

Calculate the cosine similarity between two sets of terms.

This method handles inputs as strings, lists of strings, or pre-computed numpy arrays of embeddings.

Parameters:

terms : str, List[str], or np.ndarray
    The source terms or embeddings. Required.
terms_to_match : str, List[str], or np.ndarray
    The target terms or embeddings to match against. Required.
**kwargs : Any
    Additional arguments passed to the embedding function if embedding is required.

Returns:

ndarray
    A similarity matrix.

Raises:

ValueError
    If inputs are not strings, lists, or numpy arrays.
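The input handling described above can be sketched as a small dispatch function. `as_embeddings` and `fake_embed` are invented for illustration; the real method embeds via the client:

```python
import numpy as np

def as_embeddings(x, embed):
    """Normalize str / list-of-str / ndarray inputs to a 2D embedding array."""
    if isinstance(x, np.ndarray):
        return x          # pre-computed embeddings pass through unchanged
    if isinstance(x, str):
        x = [x]           # a single term becomes a one-element list
    if isinstance(x, list):
        return embed(x)   # embed the list of terms
    raise ValueError("inputs must be strings, lists, or numpy arrays")

# A toy 'embedder' mapping each term to a fixed vector, for illustration only
fake_embed = lambda terms: np.ones((len(terms), 3))
print(as_embeddings("aspirin", fake_embed).shape)  # (1, 3)
```

With both sides normalized to 2D arrays, the similarity matrix is then just their row-wise cosine similarity.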

LLMClientError

Bases: RuntimeError

Custom exception for LLM Client runtime errors.

instructor_client

InstructorClient dataclass

Bases: LLMClient

LLMClient implementation backed by pydantic-instructor.

This client extends the base LLMClient to support structured outputs via the instructor library.

Parameters:

instructor_mode : Mode, default JSON
    The mode for the instructor client (e.g., JSON, TOOLS).

Attributes:

_client : Any
    The initialized instructor client wrapper.

__post_init__()

Initialize the Instructor wrapper around the OpenAI client.

complete(messages, response_model=None, show_prompt=False, **kwargs)

Run a chat completion.

If response_model is provided, structured output is returned based on the Pydantic model. Otherwise, plain text is returned.

Parameters:

messages : list of dict
    The list of chat messages (e.g., [{'role': 'user', 'content': '...'}]). Required.
response_model : type[T], default None
    A Pydantic model class (T) to structure the response. Must be a subclass of BaseModel.
show_prompt : bool, default False
    If True, logs the rendered prompt before sending.
**kwargs : Any
    Additional arguments passed to chat.completions.create.

Returns:

Union[str, T]
    The response string (if no model is provided) or an instance of T (the Pydantic model).

Raises:

LLMClientError
    If the completion request fails.
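A response_model is any Pydantic BaseModel subclass. The schema below is a hypothetical example of what one might pass; the field names and the commented call are invented for illustration:

```python
from pydantic import BaseModel

class Diagnosis(BaseModel):
    """Hypothetical structured output for a clinical extraction prompt."""
    condition: str
    icd10_code: str

# With a live client, a call might look like (untested sketch):
# result = client.complete(
#     messages=[{"role": "user", "content": "Extract the diagnosis: ..."}],
#     response_model=Diagnosis,
# )
# result would then be a Diagnosis instance rather than a raw string.

d = Diagnosis(condition="hypertension", icd10_code="I10")
print(d.condition)
```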

messages_from_prompt_template(prompt_template, text)

Generate a list of messages from a PromptTemplate.

Raises:

NotImplementedError
    Always. This method is not currently supported for LinkML Template prompts.

render_prompt_messages(messages)

Render a list of chat messages into a single string for logging or display.

Parameters:

messages : list of dict
    The chat history. Required.

Returns:

str
    A formatted string representation of the conversation.
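A minimal sketch of this kind of rendering; the exact output format produced by the library is an assumption here:

```python
def render_prompt_messages(messages):
    """Join chat messages into a readable 'ROLE: content' transcript."""
    return "\n".join(f"{m['role'].upper()}: {m['content']}" for m in messages)

chat = [
    {"role": "system", "content": "You are a terminology assistant."},
    {"role": "user", "content": "Map 'heart attack' to OMOP."},
]
print(render_prompt_messages(chat))
```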