autointent.Context
- class autointent.Context(seed=42)
Context manager for configuring and managing data handling, vector indexing, and optimization.
This class provides methods to set up logging, configure data and vector index components, manage datasets, and retrieve various configurations for inference and optimization.
- Parameters:
seed (int)
- data_handler: autointent.context.data_handler.DataHandler
- vector_index_client: autointent.context.vector_index_client.VectorIndexClient
- optimization_info: autointent.context.optimization_info.OptimizationInfo
- callback_handler
- seed = 42
- configure_logging(config)
Configure logging settings.
- Parameters:
config (autointent.configs.LoggingConfig) – Logging configuration settings.
- Return type:
None
- configure_vector_index(config, embedder_config=None)
Configure the vector index client and embedder.
- Parameters:
config (autointent.configs.VectorIndexConfig) – Configuration for the vector index.
embedder_config (autointent.configs.EmbedderConfig | None) – Configuration for the embedder. If None, a default EmbedderConfig is used.
- Return type:
None
- configure_data(config)
Configure data handling.
- Parameters:
config (autointent.configs.DataConfig) – Configuration for the data handling process.
- Return type:
None
- set_dataset(dataset, force_multilabel=False)
Set the datasets for training, validation, and testing.
- Parameters:
dataset (autointent.Dataset) – Dataset.
force_multilabel (bool) – Whether to force multilabel classification.
- Return type:
None
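The configuration methods above can be chained in a small helper. The sketch below is only an illustration: `prepare_context` is a hypothetical name, the call order is an assumption rather than a documented requirement, and the function relies on duck typing so that it uses nothing beyond the method names and parameters listed above.

```python
from typing import Any


def prepare_context(
    ctx: Any,
    logging_config: Any,
    vector_index_config: Any,
    data_config: Any,
    dataset: Any,
    force_multilabel: bool = False,
) -> Any:
    """Hypothetical helper: apply the three configs, then attach the dataset.

    `ctx` is expected to expose the methods documented for autointent.Context.
    The order used here is an assumption, not a documented requirement.
    """
    ctx.configure_logging(logging_config)
    # embedder_config is left at its documented default (None).
    ctx.configure_vector_index(vector_index_config)
    ctx.configure_data(data_config)
    ctx.set_dataset(dataset, force_multilabel=force_multilabel)
    return ctx
```

Returning `ctx` lets the helper be used inline, for example `ctx = prepare_context(Context(seed=42), ...)`, once real config objects are available.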
- get_inference_config()
Generate configuration settings for inference.
- dump()
Save logs, configurations, and datasets to disk.
Dumps evaluation results, training/test data splits, and inference configurations to the specified logging directory.
- Return type:
None
- get_db_dir()
Get the database directory of the vector index.
- Returns:
Path to the database directory.
- Return type:
- get_device()
Get the embedder device used by the vector index client.
- Returns:
Device name.
- Return type:
str
- get_max_length()
Get the maximum sequence length for embeddings.
- Returns:
Maximum length or None if not set.
- Return type:
int | None
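Because `get_max_length()` may return None, callers typically need a fallback. A minimal sketch, assuming a hypothetical helper name `effective_max_length` and an arbitrary illustrative default of 512 (neither comes from the library):

```python
from typing import Any


def effective_max_length(ctx: Any, fallback: int = 512) -> int:
    """Return ctx.get_max_length(), or `fallback` when no length is set.

    `fallback=512` is an illustrative value, not a library default.
    """
    max_length = ctx.get_max_length()
    # Compare against None explicitly so that any configured integer is kept.
    return fallback if max_length is None else max_length
```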
- get_use_cache()
Check if caching is enabled for the embedder.
- Returns:
True if caching is enabled, False otherwise.
- Return type:
bool
- get_dump_dir()
Get the directory for saving dumped modules.
- Returns:
Path to the dump directory or None if dumping is disabled.
- Return type:
pathlib.Path | None
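Since `get_dump_dir()` returns None when dumping is disabled, code that builds paths under it should check for that case first. The sketch below assumes a hypothetical helper `module_dump_path` and a caller-chosen `module_name`; it does not reflect how the library lays out its dump directory.

```python
from pathlib import Path
from typing import Any, Optional


def module_dump_path(ctx: Any, module_name: str) -> Optional[Path]:
    """Return a path under the dump directory, or None if dumping is disabled.

    The sub-path layout (dump_dir / module_name) is an assumption made
    for illustration only.
    """
    dump_dir = ctx.get_dump_dir()
    if dump_dir is None:
        return None
    return Path(dump_dir) / module_name
```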
- is_multilabel()
Check if the dataset is configured for multilabel classification.
- Returns:
True if multilabel classification is enabled, False otherwise.
- Return type:
bool
- get_n_classes()
Get the number of classes in the dataset.
- Returns:
Number of classes.
- Return type:
int
- is_ram_to_clear()
Check if RAM clearing is enabled in the logging configuration.
- Returns:
True if RAM clearing is enabled, False otherwise.
- Return type:
bool
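The getter methods above can be gathered into a single snapshot for logging or debugging. This is a minimal sketch: `summarize_context` is a hypothetical name, the dictionary keys are chosen for illustration, and the function calls only the getters documented on this page.

```python
from typing import Any, Dict


def summarize_context(ctx: Any) -> Dict[str, Any]:
    """Collect the documented getter values of a Context-like object."""
    return {
        "db_dir": ctx.get_db_dir(),
        "device": ctx.get_device(),
        "max_length": ctx.get_max_length(),    # may be None
        "use_cache": ctx.get_use_cache(),
        "dump_dir": ctx.get_dump_dir(),        # None if dumping is disabled
        "multilabel": ctx.is_multilabel(),
        "n_classes": ctx.get_n_classes(),
        "ram_to_clear": ctx.is_ram_to_clear(),
    }
```

Several of these getters depend on earlier configuration, so the helper is best called after the vector index, data, and logging have been configured and a dataset has been set.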