autointent.generation.Generator#

class autointent.generation.Generator(base_url=None, model_name=None, use_cache=True, client_params=None, **generation_params)#

Wrapper class for accessing OpenAI-compatible API endpoints for LLM generation.

This class provides a unified interface for interacting with OpenAI-compatible APIs, supporting both synchronous and asynchronous operations. It includes built-in caching, retry logic for structured output, and automatic environment variable detection.

The Generator can work with various OpenAI-compatible services, including:

  • OpenAI’s official API

  • Azure OpenAI

  • Local inference servers (vLLM, Ollama, etc.)

  • Other OpenAI-compatible endpoints

Environment Variables:

The following environment variables can be used for configuration:

OPENAI_API_KEY (required):

API key for authentication with the OpenAI-compatible service. This is required for most API endpoints.

OPENAI_BASE_URL (optional):

Base URL for the API endpoint. If not provided, defaults to OpenAI’s API. Examples:

  • https://api.openai.com/v1 (OpenAI official)

  • https://your-org.openai.azure.com (Azure OpenAI)

  • http://localhost:8000/v1 (local vLLM server)

OPENAI_MODEL_NAME (optional):

Default model name to use if not specified in the constructor. Examples: “gpt-4o-mini”, “gpt-3.5-turbo”, “claude-3-haiku”

Parameters:
  • base_url (str | None) – HTTP endpoint for API requests. If None, uses OPENAI_BASE_URL environment variable.

  • model_name (str | None) – Name of the language model. If None, uses OPENAI_MODEL_NAME environment variable.

  • use_cache (bool) – Whether to enable caching for structured outputs (default: True).

  • client_params (dict[str, Any] | None) – Additional parameters passed to the OpenAI client constructor.

  • **generation_params (dict[str, Any]) – Additional parameters passed to the chat completion API calls.

Example:#

import os
from autointent.generation import Generator

# Method 1: Using environment variables
# Set these in your environment or .env file:
# OPENAI_API_KEY=your-api-key-here
# OPENAI_MODEL_NAME=gpt-4o-mini
# OPENAI_BASE_URL=https://api.openai.com/v1  # optional

generator = Generator()

# Method 2: Explicit configuration
generator = Generator(
    base_url="https://api.openai.com/v1",
    model_name="gpt-4o-mini",
    temperature=0.7,
    max_tokens=1000
)
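
# Method 3 (sketch): a local OpenAI-compatible server such as vLLM or
# Ollama. The URL, model name, and timeout below are illustrative
# placeholders; client_params is forwarded to the OpenAI client constructor.
generator = Generator(
    base_url="http://localhost:8000/v1",
    model_name="your-local-model",
    client_params={"timeout": 30},
)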

# Basic chat completion
from autointent.generation.chat_templates import Message, Role

messages: list[Message] = [{"role": Role.USER, "content": "Hello, how are you?"}]
response = generator.get_chat_completion(messages)
Raises:

ValueError – If model_name is not provided and OPENAI_MODEL_NAME is not set.

model_name#
base_url#
use_cache = True#
client#
async_client#
generation_params#
cache#
get_chat_completion(messages)#

Prompt the LLM and return its answer.

Parameters:

messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.

Return type:

str

async get_chat_completion_async(messages)#

Prompt the LLM asynchronously and return its answer.

Parameters:

messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.

Return type:

str
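
A minimal sketch of calling the asynchronous variant from an asyncio program; the prompt text is illustrative:

import asyncio

from autointent.generation import Generator
from autointent.generation.chat_templates import Role

async def main() -> None:
    generator = Generator()  # reads OPENAI_* environment variables
    messages = [{"role": Role.USER, "content": "Summarize asyncio in one sentence."}]
    answer = await generator.get_chat_completion_async(messages)
    print(answer)

asyncio.run(main())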

async get_structured_output_async(messages, output_model, max_retries=3)#

Prompt the LLM asynchronously and return structured output parsed into the provided Pydantic model.

Parameters:

  • messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.

  • output_model (type[T]) – Pydantic model class used to parse the response.

  • max_retries (int) – Maximum number of retry attempts when the response cannot be parsed into the model (default: 3).
Returns:

Parsed response as an instance of the provided Pydantic model.

Return type:

T

get_structured_output_sync(messages, output_model, max_retries=3)#

Prompt the LLM and return structured output parsed into the provided Pydantic model.

Parameters:

  • messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.

  • output_model (type[T]) – Pydantic model class used to parse the response.

  • max_retries (int) – Maximum number of retry attempts when the response cannot be parsed into the model (default: 3).
Returns:

Parsed response as an instance of the provided Pydantic model.

Return type:

T
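
A minimal sketch of requesting structured output, assuming a hypothetical Sentiment schema defined for illustration; get_structured_output_async mirrors this signature for use inside a coroutine:

from pydantic import BaseModel

from autointent.generation import Generator
from autointent.generation.chat_templates import Role

class Sentiment(BaseModel):  # hypothetical output schema for illustration
    label: str
    confidence: float

generator = Generator()
messages = [{"role": Role.USER, "content": "Classify the sentiment of: 'Great service!'"}]
result = generator.get_structured_output_sync(messages, Sentiment, max_retries=3)
print(result.label, result.confidence)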

dump(path, exist_ok=True)#

Save the generator to the given path.

Parameters:

  • path (pathlib.Path) – Destination path for the dump.

  • exist_ok (bool) – If True, do not raise if the path already exists (default: True).
Return type:

None

classmethod load(path)#

Load a previously dumped Generator from the given path.

Parameters:

path (pathlib.Path)

Return type:

Generator
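
A sketch of persisting and restoring a generator; the path is an illustrative placeholder:

from pathlib import Path

from autointent.generation import Generator

generator = Generator(model_name="gpt-4o-mini")
generator.dump(Path("generator_dump"), exist_ok=True)

restored = Generator.load(Path("generator_dump"))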