autointent.modules.scoring.DescriptionScorer#

class autointent.modules.scoring.DescriptionScorer(embedder_name, temperature=1.0, embedder_device='cpu', batch_size=32, max_length=None, embedder_use_cache=False)#

Bases: autointent.modules.abc.ScoringModule

Scoring module that scores utterances based on similarity to intent descriptions.

DescriptionScorer embeds both the utterances and the intent descriptions, computes the cosine similarity between each utterance and each description, and applies a temperature-scaled softmax to turn the similarities into scores.
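The scoring step described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper names (`cosine_similarity`, `softmax`, `score_utterance`) are hypothetical, the toy vectors stand in for real embeddings, and dividing the similarities by the temperature is an assumption based on the usual meaning of "temperature".

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by the temperature (assumed scaling)."""
    scaled = [z / temperature for z in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def score_utterance(utterance_vec, description_vecs, temperature=1.0):
    """Return one probability per intent description for a single utterance."""
    sims = [cosine_similarity(utterance_vec, d) for d in description_vecs]
    return softmax(sims, temperature)


# Toy 2-D "embeddings"; real ones would come from the embedder model.
description_vecs = [[1.0, 0.0], [0.0, 1.0]]
probs = score_utterance([0.9, 0.1], description_vecs)
```

Here `probs` holds one value per description, the values sum to 1, and the first description receives the larger share because the utterance vector points mostly in its direction.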

Variables:
  • weights_file_name – Filename for saving the description vectors (description_vectors.npy).

  • embedder – The embedder used to generate embeddings for utterances and descriptions.

  • precomputed_embeddings – Flag indicating whether precomputed embeddings are used.

  • embedding_model_subdir – Directory for storing the embedder’s model files.

  • _vector_index – Internal vector index used when embeddings are precomputed.

  • db_dir – Directory path where the vector database is stored.

  • name – Name of the scorer, defaults to “description”.

Parameters:
  • embedder_name (str)

  • temperature (float)

  • embedder_device (str)

  • batch_size (int)

  • max_length (int | None)

  • embedder_use_cache (bool)

weights_file_name: str = 'description_vectors.npy'#
embedder: autointent.Embedder#
precomputed_embeddings: bool = False#
embedding_model_subdir: str = 'embedding_model'#
db_dir: str#
name = 'description'#
temperature = 1.0#
embedder_device = 'cpu'#
embedder_name#
batch_size = 32#
max_length = None#
embedder_use_cache = False#

classmethod from_context(context, temperature, embedder_name=None)#

Create a DescriptionScorer instance using a Context object.

Parameters:
  • context (autointent.Context) – Context containing configurations and utilities.

  • temperature (float) – Temperature parameter for scaling logits.

  • embedder_name (str | None) – Name of the embedder model. If None, the best embedder is used.

Returns:

Initialized DescriptionScorer instance.

Return type:

DescriptionScorer
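To give a feel for the temperature parameter, the sketch below applies a softmax to the same made-up similarity values at three temperatures. It assumes the logits are divided by the temperature, which is the common convention but is not confirmed by this page; the `softmax` helper and the numbers are illustrative only.

```python
import math


def softmax(logits, temperature):
    """Softmax over logits divided by the temperature (assumed convention)."""
    scaled = [z / temperature for z in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up cosine similarities between one utterance and three intents.
similarities = [0.8, 0.5, 0.1]

sharp = softmax(similarities, temperature=0.1)    # concentrates on the best match
neutral = softmax(similarities, temperature=1.0)  # default value
flat = softmax(similarities, temperature=10.0)    # close to uniform
```

Under this convention, a temperature below 1 pushes most of the probability mass onto the closest description, while a temperature above 1 flattens the distribution. Because cosine similarities lie in a narrow range, the choice of temperature can noticeably change how confident the scores look.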

get_embedder_name()#

Get the name of the embedder.

Returns:

Embedder name.

Return type:

str

fit(utterances, labels, descriptions)#

Fit the scorer by embedding utterances and descriptions.

Parameters:
  • utterances (list[str]) – List of utterances to embed.

  • labels (list[autointent.custom_types.LabelType]) – List of labels corresponding to the utterances.

  • descriptions (list[str]) – List of intent descriptions.

Raises:

ValueError – If any description is None, or if the number of embeddings does not match the number of utterances.

Return type:

None
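The conditions under which fit raises ValueError can be checked before calling it. The helpers below are hypothetical and are not part of autointent; they only mirror the two documented failure cases so that bad input is caught early with a readable message.

```python
def check_descriptions(descriptions):
    """Raise ValueError if any intent description is missing (None)."""
    missing = [i for i, d in enumerate(descriptions) if d is None]
    if missing:
        raise ValueError(f"Missing descriptions at positions {missing}")


def check_embedding_count(embeddings, utterances):
    """Raise ValueError if there is not exactly one embedding per utterance."""
    if len(embeddings) != len(utterances):
        raise ValueError(
            f"Got {len(embeddings)} embeddings for {len(utterances)} utterances"
        )


# Valid input passes silently.
check_descriptions(["Open a new account", "Close an existing account"])
check_embedding_count([[0.1, 0.2], [0.3, 0.4]], ["hi", "bye"])
```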

predict(utterances)#

Predict scores for utterances based on similarity to intent descriptions.

Parameters:

utterances (list[str]) – List of utterances to score.

Returns:

Array of probabilities for each utterance.

Return type:

numpy.typing.NDArray[numpy.float64]
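A common next step is to pick the highest-scoring intent for each utterance. The sketch below assumes the returned array has one row per utterance and one column per intent, which this page does not state explicitly; nested lists stand in for the NumPy array, and `top_intent` is a hypothetical helper.

```python
def top_intent(prob_rows):
    """Return the index of the largest probability in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in prob_rows]


# Stand-in for the output of predict() on two utterances and three intents.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]
predicted = top_intent(scores)
```

With a real NumPy array, `scores.argmax(axis=1)` gives the same indices under the same shape assumption.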

clear_cache()#

Clear cached data in memory used by the embedder.

Return type:

None

dump(path)#

Save the scorer’s metadata, description vectors, and embedder state.

Parameters:

path (str) – Path to the directory where assets will be dumped.

Return type:

None

load(path)#

Load the scorer’s metadata, description vectors, and embedder state.

Parameters:

path (str) – Path to the directory containing the dumped assets.

Return type:

None
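The methods above might be combined as in the following sketch. The constructor and method signatures are taken from this page, but several details are assumptions: the embedder model name is only an example, descriptions are assumed to be ordered by label index, and the restored scorer is assumed to need an instance created with the same embedder name before load is called. The library import sits inside the function so that defining it does not require autointent to be installed.

```python
def run_description_scorer_demo(dump_dir):
    """Fit, predict, dump, and reload a DescriptionScorer (illustrative only)."""
    # Lazy import: autointent is only needed when the demo is actually run.
    from autointent.modules.scoring import DescriptionScorer

    scorer = DescriptionScorer(
        embedder_name="sentence-transformers/all-MiniLM-L6-v2",  # example model
        temperature=0.5,
    )

    utterances = [
        "I want to open a savings account",
        "How do I create a new account?",
        "Please close my account",
        "I would like to cancel my account",
    ]
    labels = [0, 0, 1, 1]
    # Assumption: descriptions[i] describes the intent with label i.
    descriptions = [
        "The user wants to open a new account.",
        "The user wants to close an existing account.",
    ]

    scorer.fit(utterances, labels, descriptions)
    probabilities = scorer.predict(["Can you shut down my account?"])

    # Persist the scorer, then restore it into a fresh instance.
    scorer.dump(dump_dir)
    restored = DescriptionScorer(embedder_name=scorer.get_embedder_name())
    restored.load(dump_dir)

    return probabilities, restored.predict(["Can you shut down my account?"])
```

Calling `run_description_scorer_demo("scorer_dump")` would load the embedder model, so it needs autointent installed and, on first use, network access to fetch the model.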