autointent.Embedder

class autointent.Embedder(embedder_config)

A wrapper for managing embedding models with multiple backends.

This class handles initialization, saving, loading, and clearing of embedding models, as well as calculating embeddings for input texts.

Parameters:

embedder_config (autointent.configs._embedder.EmbedderConfig)

config

train(utterances, labels, config)

Train the embedding model (only supported for backends with training support).

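Parameters:
  • utterances – Training utterances.

  • labels – Labels for the training utterances.

  • config – Training configuration.

Return type: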

None

clear_ram()

Move the embedding model to CPU and delete it from memory.

Return type:

None

dump(path)

Save the embedding model and metadata to disk.

Parameters:

path (pathlib.Path) – Path to the directory where the model will be saved.

Return type:

None

classmethod load(path, override_config=None)

Load the embedding model and metadata from disk.

Parameters:
  • path (pathlib.Path | str) – Path to the directory where the model is stored.

  • override_config (autointent.configs._embedder.EmbedderConfig | None) – Optional configuration that overrides the settings saved with the model.

Return type:

Embedder
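
The dump/load pair follows a common save-to-directory, restore-by-classmethod pattern. The sketch below illustrates that pattern only. ToyConfig, ToyEmbedder, the config.json file name, and the batch_size field are hypothetical stand-ins, not autointent's classes or on-disk format. It also assumes that override_config replaces the saved configuration wholesale; whether the real class merges settings field by field is not stated here.

```python
from __future__ import annotations

import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class ToyConfig:
    """Hypothetical stand-in for EmbedderConfig."""

    model_name: str
    batch_size: int = 32


class ToyEmbedder:
    """Hypothetical stand-in that mirrors only the dump/load signatures."""

    def __init__(self, config: ToyConfig) -> None:
        self.config = config

    def dump(self, path: Path) -> None:
        # Create the target directory and write the metadata into it.
        path.mkdir(parents=True, exist_ok=True)
        (path / "config.json").write_text(json.dumps(asdict(self.config)))

    @classmethod
    def load(cls, path: Path | str, override_config: ToyConfig | None = None) -> ToyEmbedder:
        # Accept both str and Path, as the documented signature does.
        saved = ToyConfig(**json.loads((Path(path) / "config.json").read_text()))
        # Assumption: an override replaces the saved settings entirely.
        return cls(override_config if override_config is not None else saved)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "embedder"
    ToyEmbedder(ToyConfig("toy-model")).dump(target)
    restored = ToyEmbedder.load(str(target))
    overridden = ToyEmbedder.load(target, override_config=ToyConfig("toy-model", batch_size=8))
```

Here restored carries the saved settings, while overridden carries the settings passed at load time.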

embed(utterances: list[str], task_type: autointent.configs.TaskTypeEnum | None = None, *, return_tensors: Literal[True]) → torch.Tensor
embed(utterances: list[str], task_type: autointent.configs.TaskTypeEnum | None = None, *, return_tensors: Literal[False] = False) → numpy.typing.NDArray[numpy.float32]

Calculate embeddings for a list of utterances.

Parameters:
  • utterances – List of input texts to calculate embeddings for.

  • task_type – Type of task for which embeddings are calculated.

  • return_tensors – If True, return a PyTorch tensor; otherwise, return a numpy array.

Returns:

A numpy array or PyTorch tensor of embeddings.
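
The two embed signatures are typing overloads: the literal value of the keyword-only return_tensors flag tells a type checker which return type to expect. A minimal sketch of the same pattern follows. The embed function here is a toy, ToyTensor is a hypothetical stand-in for torch.Tensor so that the example needs only NumPy, the task_type parameter is omitted for brevity, and the features are made up rather than produced by a model.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Literal, overload

import numpy as np
import numpy.typing as npt


@dataclass
class ToyTensor:
    """Hypothetical stand-in for torch.Tensor."""

    data: npt.NDArray[np.float32]


@overload
def embed(utterances: list[str], *, return_tensors: Literal[True]) -> ToyTensor: ...
@overload
def embed(utterances: list[str], *, return_tensors: Literal[False] = False) -> npt.NDArray[np.float32]: ...
def embed(utterances: list[str], *, return_tensors: bool = False) -> ToyTensor | npt.NDArray[np.float32]:
    # Toy features (character count, vowel count) instead of a real model.
    vectors = np.array(
        [[len(u), sum(ch in "aeiou" for ch in u.lower())] for u in utterances],
        dtype=np.float32,
    )
    return ToyTensor(vectors) if return_tensors else vectors


as_array = embed(["hello", "hi"])  # inferred as a float32 NumPy array
as_tensor = embed(["hello", "hi"], return_tensors=True)  # inferred as ToyTensor
```

Because of the bare * in the signature, return_tensors must be passed by keyword; passing it positionally raises a TypeError.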

similarity(embeddings1, embeddings2)

Calculate similarity between two sets of embeddings.

Parameters:
  • embeddings1 (numpy.typing.NDArray[numpy.float32]) – First set of embeddings (size n).

  • embeddings2 (numpy.typing.NDArray[numpy.float32]) – Second set of embeddings (size m).

Returns:

A numpy array of similarities (size n x m).

Return type:

numpy.typing.NDArray[numpy.float32]
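
The n x m shape contract can be illustrated with plain NumPy. The sketch below assumes cosine similarity; the metric that Embedder.similarity actually applies is not stated in this reference and may depend on the backend. The function name pairwise_cosine is hypothetical.

```python
import numpy as np
import numpy.typing as npt


def pairwise_cosine(
    a: npt.NDArray[np.float32], b: npt.NDArray[np.float32]
) -> npt.NDArray[np.float32]:
    """Return an (n, m) matrix of cosine similarities between rows of a and b."""
    # Scale each row to unit length; eps guards against all-zero rows.
    eps = np.float32(1e-12)
    a_unit = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), eps)
    b_unit = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), eps)
    # Entry (i, j) is the dot product of unit row i of a with unit row j of b.
    return (a_unit @ b_unit.T).astype(np.float32)


emb1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], dtype=np.float32)  # n = 3
emb2 = np.array([[1.0, 0.0], [0.0, 2.0]], dtype=np.float32)  # m = 2
sims = pairwise_cosine(emb1, emb2)
print(sims.shape)  # (3, 2)
```

Row i of the result holds the similarities between the i-th embedding of the first set and every embedding of the second set.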