autointent.generation.Generator#
- class autointent.generation.Generator(base_url=None, model_name=None, use_cache=True, client_params=None, **generation_params)#
- Wrapper class for accessing OpenAI-compatible API endpoints for LLM generation. - This class provides a unified interface for interacting with OpenAI-compatible APIs, supporting both synchronous and asynchronous operations. It includes built-in caching, retry logic for structured output, and automatic environment variable detection. - The Generator can work with various OpenAI-compatible services including: - OpenAI’s official API - Azure OpenAI - Local inference servers (vLLM, Ollama, etc.) - Other OpenAI-compatible endpoints - Environment Variables:
- The following environment variables can be used for configuration: - OPENAI_API_KEY (required):
- API key for authentication with the OpenAI-compatible service. This is required for most API endpoints. 
- OPENAI_BASE_URL (optional):
- Base URL for the API endpoint. If not provided, defaults to OpenAI’s API. - https://api.openai.com/v1 (OpenAI official) - https://your-org.openai.azure.com (Azure OpenAI) - http://localhost:8000/v1 (local vLLM server) 
- OPENAI_MODEL_NAME (optional):
- Default model name to use if not specified in the constructor. Examples: “gpt-4o-mini”, “gpt-3.5-turbo”, “claude-3-haiku” 
 
 - Parameters:
- base_url (str | None) – HTTP endpoint for API requests. If None, uses OPENAI_BASE_URL environment variable. 
- model_name (str | None) – Name of the language model. If None, uses OPENAI_MODEL_NAME environment variable. 
- use_cache (bool) – Whether to enable caching for structured outputs (default: True). 
- client_params (dict[str, Any] | None) – Additional parameters passed to the OpenAI client constructor. 
- **generation_params (dict[str, Any]) – Additional parameters passed to the chat completion API calls. 
 
 - Example:#- import os from autointent.generation import Generator # Method 1: Using environment variables # Set these in your environment or .env file: # OPENAI_API_KEY=your-api-key-here # OPENAI_MODEL_NAME=gpt-4o-mini # OPENAI_BASE_URL=https://api.openai.com/v1 # optional generator = Generator() # Method 2: Explicit configuration generator = Generator( base_url="https://api.openai.com/v1", model_name="gpt-4o-mini", temperature=0.7, max_tokens=1000 ) # Basic chat completion from autointent.generation.chat_templates import Message, Role messages = [{"role": Role.USER, "content": "Hello, how are you?"}] response = generator.get_chat_completion(messages) - raises ValueError:
- If model_name is not provided and OPENAI_MODEL_NAME is not set. 
 - model_name#
 - base_url#
 - use_cache = True#
 - client#
 - async_client#
 - generation_params#
 - cache#
 - get_chat_completion(messages)#
- Prompt LLM and return its answer. - Parameters:
- messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model. 
- Return type:
 
 - async get_chat_completion_async(messages)#
- Prompt LLM and return its answer asynchronously. - Parameters:
- messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model. 
- Return type:
 
 - async get_structured_output_async(messages, output_model, max_retries=3)#
- Prompt LLM and return structured output parsed into the provided Pydantic model asynchronously. - Parameters:
- messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model. 
- output_model (type[T]) – Pydantic model class to parse the response into. 
- max_retries (int) – Maximum number of retry attempts for failed validations. 
 
- Returns:
- Parsed response as an instance of the provided Pydantic model. 
- Return type:
- T 
 
 - get_structured_output_sync(messages, output_model, max_retries=3)#
- Prompt LLM and return structured output parsed into the provided Pydantic model. - Parameters:
- messages (list[autointent.generation.chat_templates.Message]) – List of messages to send to the model. 
- output_model (type[T]) – Pydantic model class to parse the response into. 
- max_retries (int) – Maximum number of retry attempts for failed validations. 
 
- Returns:
- Parsed response as an instance of the provided Pydantic model. 
- Return type:
- T 
 
 - dump(path, exist_ok=True)#
- Parameters:
- path (pathlib.Path) 
- exist_ok (bool) 
 
- Return type:
- None 
 
 - classmethod load(path)#
- Parameters:
- path (pathlib.Path) 
- Return type: