yera.models.interfaces.llms.ollama_interface

Interface to local and remote LLMs via Ollama.

This module provides the OllamaLLM class for interacting with LLMs served by Ollama, either on the local machine or on a remote instance. The interface supports streaming chat completions and structured output generation constrained by a JSON schema. The Ollama server must be running and reachable at the configured endpoint before any LLM request is made.

Symbols

class OllamaLLM — Interface to local and remote LLMs via Ollama.

OllamaLLM

Inherits: BaseLLMInterface

Interface to local and remote LLMs via Ollama.

Provides a wrapper around the Ollama client for interacting with language models running on an Ollama server. Supports both local models (on the same machine) and remote Ollama instances via HTTP. Validates server connectivity on start() and provides both streaming chat completions and structured output generation using JSON schema constraints.

The client is lazily initialised on start() and validates that the Ollama server is accessible before allowing API requests.

Attributes

model_id

type: str

The identifier of the model to use on the Ollama server.

connection

type: OllamaConnection

Connection configuration specifying the Ollama server URL.

client

type: Client

Lazy-initialised Ollama client instance.

Methods

start — Initialise the Ollama client and validate server connectivity.

stop — Shut down and clear the Ollama client.

chat — Stream a chat completion response from Ollama.

make_struct — Stream a structured output response conforming to a provided schema.

OllamaLLM.start

start() → None

Initialise the Ollama client and validate server connectivity.

Creates an Ollama client instance pointing to the configured server URL and verifies that the Ollama server is running and accessible. Must be called before making any API requests.

Raises

ConnectionError

If the Ollama server is not running or not accessible at the configured URL.
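
The connectivity check described above can be approximated with the standard library. This is a minimal sketch, not the code that start() actually runs: the helper name check_ollama_server and the timeout parameter are hypothetical, and it assumes the server exposes Ollama's /api/tags endpoint (the model listing route of the Ollama HTTP API).

```python
import urllib.error
import urllib.request


def check_ollama_server(base_url: str, timeout: float = 2.0) -> None:
    """Raise ConnectionError unless an HTTP server answers at base_url.

    Hypothetical helper for illustration; OllamaLLM.start() may validate
    connectivity differently.
    """
    url = f"{base_url.rstrip('/')}/api/tags"
    try:
        # Any successful response means the server is up and reachable.
        with urllib.request.urlopen(url, timeout=timeout):
            pass
    except (urllib.error.URLError, OSError) as exc:
        raise ConnectionError(
            f"Ollama server not reachable at {base_url}: {exc}"
        ) from exc
```

A default local Ollama install listens on http://localhost:11434, so check_ollama_server("http://localhost:11434") would be the typical call on a single machine.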

OllamaLLM.stop

stop() → None

Shut down and clear the Ollama client.

Releases the Ollama client instance by setting it to None. After calling this method, start() must be called again before further API requests can be made.
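
The start()/stop() lifecycle follows a common lazy-initialisation pattern, sketched below with a stand-in class. Everything here is an assumption for illustration: LazyClientHolder, its factory argument, and the RuntimeError raised by require() are hypothetical, and the source does not say how OllamaLLM reacts to a request made before start().

```python
from typing import Any, Callable, Optional


class LazyClientHolder:
    """Stand-in showing a client that exists only between start() and stop()."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory
        self.client: Optional[Any] = None  # nothing is created up front

    def start(self) -> None:
        # Create the client on demand (a real interface would also
        # validate server connectivity at this point).
        self.client = self._factory()

    def stop(self) -> None:
        # Release the client; start() must be called again before reuse.
        self.client = None

    def require(self) -> Any:
        # Assumed guard: fail loudly if the holder was never started.
        if self.client is None:
            raise RuntimeError("client not initialised; call start() first")
        return self.client
```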

OllamaLLM.chat

chat(
    messages: list[Message],
    **ollama_kw,
) → Iterator[str]

Stream a chat completion response from Ollama.

Sends a conversation to the Ollama model and streams the response as text tokens. For models with thinking/reasoning capabilities (e.g., deepseek-r1), the reasoning text is also yielded, wrapped in markers that separate it from the final answer.

Parameters

messages

type: list[Message]

List of Message objects representing the conversation history.

**ollama_kw

type: str | float | int | bool

Additional keyword arguments passed to the Ollama API.

Raises

ValueError

If the model is not found on the Ollama server. Pull the model with: ollama pull

ConnectionError

If the Ollama server is not accessible.
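
How reasoning text might be wrapped in markers while streaming can be sketched as a generator over response chunks. This is an illustration under stated assumptions, not the behaviour of chat() itself: the marker strings <think> and </think>, the function name stream_with_thinking, and the use of plain dicts with "thinking" and "content" keys as stand-ins for Ollama's streamed chunks are all assumed.

```python
from collections.abc import Iterable, Iterator

# Assumed marker strings; the actual markers used by chat() are not
# given in this reference.
THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def stream_with_thinking(chunks: Iterable[dict]) -> Iterator[str]:
    """Yield text tokens, wrapping any run of reasoning text in markers."""
    in_thinking = False
    for chunk in chunks:
        message = chunk.get("message", {})
        thinking = message.get("thinking") or ""
        content = message.get("content") or ""
        if thinking:
            if not in_thinking:
                in_thinking = True
                yield THINK_OPEN  # open the marker once per reasoning run
            yield thinking
        if content:
            if in_thinking:
                in_thinking = False
                yield THINK_CLOSE  # close before the answer text starts
            yield content
    if in_thinking:
        # The stream ended while still reasoning: close the marker.
        yield THINK_CLOSE
```

Joining the yielded strings gives one text stream in which a consumer can locate the reasoning by searching for the markers.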

OllamaLLM.make_struct

make_struct(
    messages: list[Message],
    cls: type[TStruct],
    **ollama_kw,
) → Iterator[str]

Stream a structured output response conforming to a provided schema.

Generates a response that strictly conforms to the structure defined by the provided Pydantic model class. Uses Ollama's native format parameter with a JSON schema to enforce structural compliance.

Parameters

messages

type: list[Message]

List of Message objects representing the conversation history.

cls

type: type[TStruct]

A Pydantic model class defining the desired output structure.

**ollama_kw

type: str | float | int | bool

Additional keyword arguments passed to the Ollama API.

Raises

ValueError

If the model is not found on the Ollama server. Pull the model with: ollama pull

ConnectionError

If the Ollama server is not accessible.
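
Because make_struct streams its output as text, a caller typically has to join the fragments before parsing them. The sketch below shows that step using only the standard library. The names PERSON_SCHEMA and collect_struct are hypothetical, the schema is hand-written rather than generated, and the required-field check is a simplification rather than full JSON Schema validation. With Pydantic v2, a comparable schema would normally come from the model class's model_json_schema() method.

```python
import json
from collections.abc import Iterable

# Hand-written JSON schema standing in for one derived from a Pydantic model.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}


def collect_struct(fragments: Iterable[str], schema: dict) -> dict:
    """Join streamed JSON fragments and check the schema's required fields."""
    data = json.loads("".join(fragments))
    missing = [key for key in schema.get("required", []) if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return data
```

In practice the joined text would more likely be handed to the Pydantic class itself, which validates types as well as required fields.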
