autointent.generation.Generator#
- class autointent.generation.Generator(base_url=None, model_name=None, use_cache=True, client_params=None, **generation_params)#
Wrapper class for accessing OpenAI-compatible API endpoints for LLM generation.
This class provides a unified interface for interacting with OpenAI-compatible APIs, supporting both synchronous and asynchronous operations. It includes built-in caching, retry logic for structured output, and automatic environment variable detection.
The Generator can work with various OpenAI-compatible services, including:
- OpenAI’s official API
- Azure OpenAI
- Local inference servers (vLLM, Ollama, etc.)
- Other OpenAI-compatible endpoints
- Environment Variables:
The following environment variables can be used for configuration (a short setup sketch follows the list):
- OPENAI_API_KEY (required):
API key for authentication with the OpenAI-compatible service. This is required for most API endpoints.
- OPENAI_BASE_URL (optional):
Base URL for the API endpoint. If not provided, defaults to OpenAI’s API. Examples:
- https://api.openai.com/v1 (OpenAI official)
- https://your-org.openai.azure.com (Azure OpenAI)
- http://localhost:8000/v1 (local vLLM server)
- OPENAI_MODEL_NAME (optional):
Default model name to use if not specified in the constructor. Examples: “gpt-4o-mini”, “gpt-3.5-turbo”, “claude-3-haiku”
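For instance, the variables can be set programmatically before constructing the generator. A minimal sketch; the API key, model name, and local vLLM URL below are placeholder values:

import os

from autointent.generation import Generator

# Placeholder values -- substitute your own credentials or use a .env file
os.environ["OPENAI_API_KEY"] = "your-api-key-here"
os.environ["OPENAI_MODEL_NAME"] = "gpt-4o-mini"
os.environ["OPENAI_BASE_URL"] = "http://localhost:8000/v1"  # e.g. a local vLLM server

generator = Generator()  # picks up all three variables from the environment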
- Parameters:
base_url (str | None) – HTTP endpoint for API requests. If None, uses OPENAI_BASE_URL environment variable.
model_name (str | None) – Name of the language model. If None, uses OPENAI_MODEL_NAME environment variable.
use_cache (bool) – Whether to enable caching for structured outputs (default: True).
client_params (dict[str, Any] | None) – Additional parameters passed to the OpenAI client constructor.
**generation_params (dict[str, Any]) – Additional parameters passed to the chat completion API calls.
Example:#
from autointent.generation import Generator

# Method 1: Using environment variables
# Set these in your environment or .env file:
#   OPENAI_API_KEY=your-api-key-here
#   OPENAI_MODEL_NAME=gpt-4o-mini
#   OPENAI_BASE_URL=https://api.openai.com/v1  # optional
generator = Generator()

# Method 2: Explicit configuration
generator = Generator(
    base_url="https://api.openai.com/v1",
    model_name="gpt-4o-mini",
    temperature=0.7,
    max_tokens=1000,
)

# Basic chat completion
from autointent.generation.chat_templates import Message, Role

messages = [{"role": Role.USER, "content": "Hello, how are you?"}]
response = generator.get_chat_completion(messages)
- Raises:
ValueError – If model_name is not provided and OPENAI_MODEL_NAME is not set.
- model_name#
- base_url#
- use_cache = True#
- client#
- async_client#
- generation_params#
- cache#
- get_chat_completion(messages)#
Prompt the LLM and return its answer.
- Parameters:
messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.
- Return type:
str
- async get_chat_completion_async(messages)#
Prompt the LLM and return its answer asynchronously.
- Parameters:
messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.
- Return type:
str
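A minimal usage sketch for the asynchronous variant, assuming credentials are already configured via the environment variables described above:

import asyncio

from autointent.generation import Generator
from autointent.generation.chat_templates import Role

generator = Generator()

async def main() -> None:
    messages = [{"role": Role.USER, "content": "Hello, how are you?"}]
    answer = await generator.get_chat_completion_async(messages)
    print(answer)

asyncio.run(main())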
- async get_structured_output_async(messages, output_model, max_retries=3)#
Prompt the LLM asynchronously and return structured output parsed into the provided Pydantic model.
- Parameters:
messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.
output_model (type[T]) – Pydantic model class to parse the response into.
max_retries (int) – Maximum number of retry attempts for failed validations.
- Returns:
Parsed response as an instance of the provided Pydantic model.
- Return type:
T
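A sketch of asynchronous structured generation, assuming a generator configured via environment variables; the Intent schema is a hypothetical example model:

import asyncio

from pydantic import BaseModel

from autointent.generation import Generator
from autointent.generation.chat_templates import Role

class Intent(BaseModel):  # hypothetical output schema
    name: str
    description: str

generator = Generator()

async def main() -> None:
    messages = [{"role": Role.USER, "content": "Propose an intent for a flight-booking assistant."}]
    intent = await generator.get_structured_output_async(messages, Intent, max_retries=3)
    print(intent.name, intent.description)

asyncio.run(main())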
- get_structured_output_sync(messages, output_model, max_retries=3)#
Prompt the LLM and return structured output parsed into the provided Pydantic model.
- Parameters:
messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model.
output_model (type[T]) – Pydantic model class to parse the response into.
max_retries (int) – Maximum number of retry attempts for failed validations.
- Returns:
Parsed response as an instance of the provided Pydantic model.
- Return type:
T
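The synchronous variant takes the same arguments. A self-contained sketch, again with a hypothetical Intent schema and an environment-configured generator:

from pydantic import BaseModel

from autointent.generation import Generator
from autointent.generation.chat_templates import Role

class Intent(BaseModel):  # hypothetical output schema
    name: str
    description: str

generator = Generator()
messages = [{"role": Role.USER, "content": "Propose an intent for a flight-booking assistant."}]
intent = generator.get_structured_output_sync(messages, Intent)  # up to 3 retries on validation failure
print(intent.name)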
- dump(path, exist_ok=True)#
- Parameters:
path (pathlib.Path)
exist_ok (bool)
- Return type:
None
- classmethod load(path)#
- Parameters:
path (pathlib.Path)
- Return type:
Generator
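A round-trip persistence sketch; the directory name is arbitrary, and exactly what dump writes to disk is not documented here:

from pathlib import Path

from autointent.generation import Generator

generator = Generator()
generator.dump(Path("generator_dump"), exist_ok=True)  # persist the generator to disk

restored = Generator.load(Path("generator_dump"))  # reconstruct it from the dump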