Ollama Responses

Interact with Ollama models using the OpenAI Responses API. This uses Ollama's OpenAI-compatible /v1/responses endpoint, added in Ollama v0.13.3.

Requirements

Ollama v0.13.3 or later
For local usage: Ollama server running at http://localhost:11434
For Ollama Cloud: Set OLLAMA_API_KEY environment variable

Key Features

Dual Deployment: Run locally for privacy or use Ollama Cloud for scalability
Auto-configuration: When using an API key, the host automatically defaults to Ollama Cloud
Stateless API: Each request is independent (no previous_response_id chaining)

Parameters

Parameter	Type	Default	Description
`id`	`str`	`"gpt-oss:20b"`	The ID of the Ollama model to use
`name`	`str`	`"OllamaResponses"`	The name of the model
`provider`	`str`	`"Ollama"`	The provider of the model
`host`	`Optional[str]`	`None`	The Ollama server host (defaults to `http://localhost:11434`)
`api_key`	`Optional[str]`	`None`	The API key for Ollama Cloud (not required for local)
`store`	`Optional[bool]`	`False`	Whether to store responses

Usage

Local Usage

1from kern.agent import Agent
2from kern.models.ollama import OllamaResponses
3
4agent = Agent(
5    model=OllamaResponses(id="gpt-oss:20b"),
6    markdown=True,
7)
8
9agent.print_response("Share a 2 sentence horror story")

Ollama Cloud

Set the OLLAMA_API_KEY environment variable:

1export OLLAMA_API_KEY=your-api-key

1from kern.agent import Agent
2from kern.models.ollama import OllamaResponses
3
4agent = Agent(
5    model=OllamaResponses(id="gpt-oss:20b"),
6    markdown=True,
7)
8
9agent.print_response("Share a 2 sentence horror story")

Custom Host

1from kern.agent import Agent
2from kern.models.ollama import OllamaResponses
3
4agent = Agent(
5    model=OllamaResponses(
6        id="gpt-oss:20b",
7        host="http://my-ollama-server:11434",
8    ),
9    markdown=True,
10)
11
12agent.print_response("Hello!")

Developer Resources

Ollama Responses API Documentation
Ollama (Chat Completion API)