autointent.Embedder
- class autointent.Embedder(embedder_config)
A wrapper for managing embedding models with multiple backends.
This class handles initialization, saving, loading, and clearing of embedding models, as well as calculating embeddings for input texts.
- Parameters:
embedder_config (autointent.configs._embedder.EmbedderConfig) – Configuration of the embedding model.
- config
- train(utterances, labels, config)
Train the embedding model (only supported for backends with training support).
- Parameters:
utterances – List of utterances to train on.
labels (autointent.custom_types.ListOfLabels) – List of labels corresponding to utterances.
config (autointent.configs.EmbedderFineTuningConfig) – Fine-tuning configuration.
- Return type:
None
- clear_ram()
Move the embedding model to CPU and delete it from memory.
- Return type:
None
- dump(path)
Save the embedding model and metadata to disk.
- Parameters:
path (pathlib.Path) – Path to the directory where the model will be saved.
- Return type:
None
- classmethod load(path, override_config=None)
Load the embedding model and metadata from disk.
- Parameters:
path (pathlib.Path | str) – Path to the directory where the model is stored.
override_config (autointent.configs._embedder.EmbedderConfig | None) – Optional configuration that overrides the saved settings.
- Return type:
autointent.Embedder
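The dump and load methods above can be combined into a save-and-restore round trip. The following is a minimal sketch, not part of the library: save_and_reload is a hypothetical helper that relies only on the documented dump(path) and load(path) signatures and accepts any object exposing them. Whether dump creates the target directory itself is not stated here, so the sketch makes no assumption about it.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any


def save_and_reload(embedder: Any, directory: Path | str) -> Any:
    """Dump an embedder-like object to `directory`, then load it back.

    Hypothetical helper for illustration; it is not provided by autointent.
    """
    # dump() is documented as taking a pathlib.Path, so normalize first.
    directory = Path(directory)
    embedder.dump(directory)
    # load() is a classmethod, so it is called on the instance's class.
    return type(embedder).load(directory)
```

With a real instance this would be called as save_and_reload(embedder, "some/dir"), where the path is a placeholder.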
- embed(utterances: list[str], task_type: autointent.configs.TaskTypeEnum | None = None, *, return_tensors: Literal[True]) → torch.Tensor
- embed(utterances: list[str], task_type: autointent.configs.TaskTypeEnum | None = None, *, return_tensors: Literal[False] = False) → numpy.typing.NDArray[numpy.float32]
Calculate embeddings for a list of utterances.
- Parameters:
utterances – List of input texts to calculate embeddings for.
task_type – Type of task for which embeddings are calculated.
return_tensors – If True, return a PyTorch tensor; otherwise, return a numpy array.
- Returns:
A numpy array or PyTorch tensor of embeddings.
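The two overloads above differ only in the keyword-only return_tensors flag. The sketch below shows both call forms side by side. embed_both_ways is a hypothetical helper, not part of the library, and it works with any object whose embed method follows the documented signature.

```python
from __future__ import annotations

from typing import Any


def embed_both_ways(embedder: Any, utterances: list[str]) -> tuple[Any, Any]:
    """Return embeddings in both documented output formats.

    Hypothetical helper for illustration; it is not provided by autointent.
    """
    # Default call: return_tensors is False, so a NumPy array is expected.
    as_array = embedder.embed(utterances)
    # return_tensors is keyword-only; True requests a PyTorch tensor instead.
    as_tensor = embedder.embed(utterances, return_tensors=True)
    return as_array, as_tensor
```

In practice only one of the two forms is usually needed: the tensor form when the result feeds further PyTorch computation, the array form otherwise.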
- similarity(embeddings1, embeddings2)
Calculate similarity between two sets of embeddings.
- Parameters:
embeddings1 (numpy.typing.NDArray[numpy.float32]) – First set of embeddings (size n).
embeddings2 (numpy.typing.NDArray[numpy.float32]) – Second set of embeddings (size m).
- Returns:
A numpy array of similarities (size n x m).
- Return type:
numpy.typing.NDArray[numpy.float32]
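To illustrate the n x m output shape, here is a minimal NumPy sketch of a pairwise similarity matrix. It assumes cosine similarity. The metric actually used by similarity is not stated above and may depend on the backend, so cosine_similarity_matrix is a hypothetical stand-in rather than the library's implementation. It also assumes that no embedding row is all zeros.

```python
from __future__ import annotations

import numpy as np
import numpy.typing as npt


def cosine_similarity_matrix(
    a: npt.NDArray[np.float32], b: npt.NDArray[np.float32]
) -> npt.NDArray[np.float32]:
    """Pairwise cosine similarity between rows of a (n x d) and b (m x d).

    Hypothetical stand-in; assumes every row has a non-zero norm.
    """
    # Scale each row to unit length so the dot product equals the cosine.
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    # (n x d) @ (d x m) gives one similarity per pair of rows: n x m.
    return (a_unit @ b_unit.T).astype(np.float32)


first = np.array([[1, 0], [0, 1], [1, 1]], dtype=np.float32)  # n = 3
second = np.array([[1, 0], [0, 2]], dtype=np.float32)  # m = 2
sims = cosine_similarity_matrix(first, second)
print(sims.shape)  # (3, 2)
```

Entry sims[i, j] compares row i of the first set with row j of the second, which matches the n x m layout described above.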