autointent.generation.Generator#
- class autointent.generation.Generator(base_url=None, model_name=None, use_cache=True, client_params=None, **generation_params)#
Wrapper class for accessing OpenAI-compatible API endpoints for LLM generation.
This class provides a unified interface for interacting with OpenAI-compatible APIs, supporting both synchronous and asynchronous operations. It includes built-in caching, retry logic for structured output, and automatic environment variable detection.
The Generator can work with various OpenAI-compatible services, including:
- OpenAI’s official API
- Azure OpenAI
- Local inference servers (vLLM, Ollama, etc.)
- Other OpenAI-compatible endpoints
- Environment Variables:
The following environment variables can be used for configuration (a short setup sketch follows the list):
- OPENAI_API_KEY (required):
API key for authentication with the OpenAI-compatible service. This is required for most API endpoints.
- OPENAI_BASE_URL (optional):
Base URL for the API endpoint. If not provided, defaults to OpenAI’s API. Examples:
- https://api.openai.com/v1 (OpenAI official)
- https://your-org.openai.azure.com (Azure OpenAI)
- http://localhost:8000/v1 (local vLLM server)
- OPENAI_MODEL_NAME (optional):
Default model name to use if not specified in the constructor. Examples: “gpt-4o-mini”, “gpt-3.5-turbo”, “claude-3-haiku”
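For instance, the variables can be set programmatically before constructing the generator. A minimal sketch; the API key, model name, and local vLLM URL below are placeholder values:

import os

from autointent.generation import Generator

# Placeholder values -- substitute your own credentials or use a .env file
os.environ["OPENAI_API_KEY"] = "your-api-key-here"
os.environ["OPENAI_MODEL_NAME"] = "gpt-4o-mini"
os.environ["OPENAI_BASE_URL"] = "http://localhost:8000/v1"  # e.g. a local vLLM server

generator = Generator()  # picks up all three variables from the environment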
- Parameters:
base_url (str | None) – HTTP endpoint for API requests. If None, uses OPENAI_BASE_URL environment variable.
model_name (str | None) – Name of the language model. If None, uses OPENAI_MODEL_NAME environment variable.
use_cache (bool) – Whether to enable caching for structured outputs (default: True).
client_params (dict[str, Any] | None) – Additional parameters passed to the OpenAI client constructor.
**generation_params (dict[str, Any]) – Additional parameters passed to the chat completion API calls.
Example:#
from autointent.generation import Generator

# Method 1: Using environment variables
# Set these in your environment or .env file:
#   OPENAI_API_KEY=your-api-key-here
#   OPENAI_MODEL_NAME=gpt-4o-mini
#   OPENAI_BASE_URL=https://api.openai.com/v1  # optional
generator = Generator()

# Method 2: Explicit configuration
generator = Generator(
    base_url="https://api.openai.com/v1",
    model_name="gpt-4o-mini",
    temperature=0.7,
    max_tokens=1000,
)

# Basic chat completion
from autointent.generation.chat_templates import Message, Role

messages = [{"role": Role.USER, "content": "Hello, how are you?"}]
response = generator.get_chat_completion(messages)
- Raises:
ValueError – If model_name is not provided and OPENAI_MODEL_NAME is not set.
- model_name#
- base_url#
- use_cache = True#
- client#
- async_client#
- generation_params#
- cache#
- get_chat_completion(messages)#
Prompt the LLM and return its answer.
- Parameters:
messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.
- Return type:
str
- async get_chat_completion_async(messages)#
Prompt the LLM and return its answer asynchronously.
- Parameters:
messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.
- Return type:
str
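A minimal usage sketch for the asynchronous variant, assuming credentials are already configured via the environment variables described above:

import asyncio

from autointent.generation import Generator
from autointent.generation.chat_templates import Role

generator = Generator()

async def main() -> None:
    messages = [{"role": Role.USER, "content": "Hello, how are you?"}]
    answer = await generator.get_chat_completion_async(messages)
    print(answer)

asyncio.run(main())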
- async get_structured_output_async(messages, output_model, max_retries=3)#
Prompt the LLM asynchronously and return structured output parsed into the provided Pydantic model.
- Parameters:
messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.
output_model (type[T]) – Pydantic model class to parse the response into.
max_retries (int) – Maximum number of retry attempts for failed validations.
- Returns:
Parsed response as an instance of the provided Pydantic model.
- Return type:
T
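A sketch of asynchronous structured generation, assuming a generator configured via environment variables; the Intent schema is a hypothetical example model:

import asyncio

from pydantic import BaseModel

from autointent.generation import Generator
from autointent.generation.chat_templates import Role

class Intent(BaseModel):  # hypothetical output schema
    name: str
    description: str

generator = Generator()

async def main() -> None:
    messages = [{"role": Role.USER, "content": "Propose an intent for a flight-booking assistant."}]
    intent = await generator.get_structured_output_async(messages, Intent, max_retries=3)
    print(intent.name, intent.description)

asyncio.run(main())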
- get_structured_output_sync(messages, output_model, max_retries=3)#
Prompt the LLM and return structured output parsed into the provided Pydantic model.
- Parameters:
messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.
output_model (type[T]) – Pydantic model class to parse the response into.
max_retries (int) – Maximum number of retry attempts for failed validations.
- Returns:
Parsed response as an instance of the provided Pydantic model.
- Return type:
T
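The synchronous variant takes the same arguments. A self-contained sketch, again with a hypothetical Intent schema and an environment-configured generator:

from pydantic import BaseModel

from autointent.generation import Generator
from autointent.generation.chat_templates import Role

class Intent(BaseModel):  # hypothetical output schema
    name: str
    description: str

generator = Generator()
messages = [{"role": Role.USER, "content": "Propose an intent for a flight-booking assistant."}]
intent = generator.get_structured_output_sync(messages, Intent)  # up to 3 retries on validation failure
print(intent.name)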
- dump(path, exist_ok=True)#
- Parameters:
path (pathlib.Path)
exist_ok (bool)
- Return type:
None
- classmethod load(path)#
- Parameters:
path (pathlib.Path)
- Return type:
Generator
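A round-trip persistence sketch; the directory name is arbitrary, and exactly what dump writes to disk is not documented here:

from pathlib import Path

from autointent.generation import Generator

generator = Generator()
generator.dump(Path("generator_dump"), exist_ok=True)  # persist the generator to disk

restored = Generator.load(Path("generator_dump"))  # reconstruct it from the dump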