
Ollama Adapter

The Ollama adapter provides integration with Ollama’s API, supporting any model served by your local Ollama instance.

Features

  • Logging: Comprehensive logging with request context and performance metrics
  • Health Checks: Built-in health check to verify API connectivity
  • Token Tracking: Automatic token usage tracking for cost monitoring

Basic Usage

from parsec.models.adapters import OllamaAdapter

adapter = OllamaAdapter(
    model="llama3",
    base_url="http://localhost:11434"
)

# Generate a response
result = await adapter.generate("What is the capital of France?")
print(result.output)       # "Paris"
print(result.tokens_used)  # e.g., 25
print(result.latency_ms)   # e.g., 342.5

Structured Output with Schema

The Ollama adapter uses native JSON mode when a schema is provided:

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "email": {"type": "string"}
    },
    "required": ["name", "age"]
}

result = await adapter.generate(
    "Extract: John Doe is 30 years old, john@example.com",
    schema=schema,
    temperature=0.7
)
print(result.output)
# '{"name": "John Doe", "age": 30, "email": "john@example.com"}'
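
Because result.output is a JSON string, a natural next step is to parse it and validate it against the same schema. A minimal sketch, reusing the schema variable above and assuming the third-party jsonschema package (not required by the adapter itself):

import json

from jsonschema import ValidationError, validate  # pip install jsonschema

# Parse the JSON string returned by the adapter
data = json.loads(result.output)

# Validate against the same schema passed to generate()
try:
    validate(instance=data, schema=schema)
except ValidationError as e:
    print(f"Model output did not match the schema: {e.message}")
else:
    print(data["name"], data["age"])  # John Doe 30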

Streaming

Streaming support is still under development.
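
Until streaming lands in the adapter, one workaround is to call Ollama’s /api/generate endpoint directly: with "stream": true it emits one JSON object per line, each carrying a response fragment and a done flag. A minimal sketch using the third-party httpx client (not part of parsec):

import asyncio
import json

import httpx  # pip install httpx

async def stream_tokens(prompt: str, model: str = "llama3") -> None:
    """Print response fragments as Ollama streams them."""
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream(
            "POST",
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": prompt, "stream": True},
        ) as response:
            async for line in response.aiter_lines():
                if not line:
                    continue
                chunk = json.loads(line)
                if chunk.get("done"):
                    break
                print(chunk["response"], end="", flush=True)

asyncio.run(stream_tokens("Why is the sky blue?"))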

Configuration Options

Parameter          Type   Default   Description
model              str    Required  Model name (e.g., “llama3”)
temperature        float  0.7       Sampling temperature (0.0 to 2.0)
max_output_tokens  int    None      Maximum tokens to generate
schema             dict   None      JSON schema for structured output
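
Putting these together: a minimal sketch assuming, as in the examples above, that model and base_url are constructor arguments while per-request options such as temperature and max_output_tokens are passed to generate():

adapter = OllamaAdapter(
    model="llama3",
    base_url="http://localhost:11434"
)

# Per-request options override the defaults from the table above
result = await adapter.generate(
    "Summarize the plot of Hamlet in one sentence.",
    temperature=0.2,       # lower temperature for more deterministic output
    max_output_tokens=64   # cap the response length
)
print(result.output)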

Logging

The adapter includes comprehensive logging:

import logging

# Use DEBUG rather than INFO to also see the per-request token counts
logging.basicConfig(level=logging.DEBUG)

# Logs will show:
# INFO - Generating response from Ollama model llama3
# DEBUG - Success: 25 tokens

Health Check

Verify API connectivity:

is_healthy = await adapter.health_check()
if is_healthy:
    print("Ollama API is accessible")
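
A common pattern is to gate application startup on a healthy connection, retrying while the Ollama server comes up. A minimal sketch using only the health_check() call above; the attempt count and delay are arbitrary choices:

import asyncio

async def wait_until_healthy(adapter, attempts: int = 5, delay: float = 2.0) -> None:
    """Poll health_check() until it succeeds or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if await adapter.health_check():
            return
        print(f"Ollama not reachable (attempt {attempt}/{attempts}), retrying...")
        await asyncio.sleep(delay)
    raise RuntimeError("Ollama API is not reachable")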

Supported Models

  • llama3 - general-purpose text tasks

Since any model served by your Ollama instance is supported (see above), other models pulled with ollama pull work the same way: pass their name as model.

Error Handling

try:
    result = await adapter.generate("Hello")
except Exception as e:
    # Logs automatically include full stack trace
    print(f"Generation failed: {e}")
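
For transient failures (e.g., the server restarting), a simple retry with exponential backoff often suffices. A minimal sketch built only on generate(); the broad except is deliberate because the adapter’s specific exception classes are not documented here:

import asyncio

async def generate_with_retry(adapter, prompt: str, attempts: int = 3):
    """Retry generate() with exponential backoff between attempts."""
    for attempt in range(attempts):
        try:
            return await adapter.generate(prompt)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; propagate the last error
            await asyncio.sleep(2 ** attempt)  # back off: 1s, 2s, ...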

Important Notes

Token Counting

The adapter reports token usage when available:

  • prompt_token_count - Input tokens
  • candidates_token_count - Output tokens
  • Total reported in tokens_used
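
A minimal pattern for cost monitoring, assuming only the tokens_used field shown above (which may be None when the API omits counts):

total_tokens = 0

for prompt in ["First prompt", "Second prompt"]:
    result = await adapter.generate(prompt)
    total_tokens += result.tokens_used or 0
    print(f"{prompt!r}: {result.tokens_used} tokens")

print(f"Total tokens this session: {total_tokens}")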