autointent.modules.scoring.DescriptionScorer

class autointent.modules.scoring.DescriptionScorer(embedder_config=None, temperature=1.0)

Bases: autointent.modules.base.BaseScorer

Scoring module that scores utterances based on similarity to intent descriptions.

DescriptionScorer embeds both the utterances and the intent descriptions, computes the cosine similarity between each utterance and each description, and converts the similarities to probabilities with a temperature-scaled softmax.
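The scoring step described above can be sketched in plain NumPy. This is a minimal illustration, not the library's implementation: the function name description_scores, the toy 2-d vectors, and the choice to divide the similarities by the temperature (rather than multiply) are all assumptions made for the example.

```python
import numpy as np


def description_scores(utterance_vecs, description_vecs, temperature=1.0):
    """Cosine similarity to each description, then a temperature-scaled softmax.

    Hypothetical helper for illustration only; not part of autointent.
    """
    # L2-normalise rows so that the dot product equals cosine similarity.
    u = utterance_vecs / np.linalg.norm(utterance_vecs, axis=1, keepdims=True)
    d = description_vecs / np.linalg.norm(description_vecs, axis=1, keepdims=True)
    similarities = u @ d.T  # shape: (n_utterances, n_intents)

    # Assumption: logits are divided by the temperature before the softmax.
    logits = similarities / temperature
    # Subtract the row maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


# Toy 2-d "embeddings", made up for illustration (not real model output).
utterances = np.array([[1.0, 0.0], [0.0, 2.0]])
descriptions = np.array([[1.0, 0.1], [0.1, 1.0]])

probs = description_scores(utterances, descriptions, temperature=1.0)
sharp = description_scores(utterances, descriptions, temperature=0.1)
```

Under this convention, a temperature below 1.0 sharpens each row toward the closest description, while a temperature above 1.0 flattens it toward a uniform distribution.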

Parameters:

embedder_config (autointent.configs.EmbedderConfig | str | dict[str, Any] | None) – Config of the embedder model
temperature (pydantic.PositiveFloat) – Temperature parameter for scaling logits, defaults to 1.0

name = 'description'

Name of the module.

supports_multiclass = True

Whether the module supports multiclass classification.

supports_multilabel = True

Whether the module supports multilabel classification.

temperature = 1.0

embedder_config

classmethod from_context(context, temperature, embedder_config=None)

Create a DescriptionScorer instance using a Context object.

Parameters:

context (autointent.Context) – Context containing configurations and utilities
temperature (pydantic.PositiveFloat) – Temperature parameter for scaling logits
embedder_config (autointent.configs.EmbedderConfig | str | None) – Config of the embedder model. If None, the best embedder is used

Returns:

Initialized DescriptionScorer instance

Return type:

DescriptionScorer

get_embedder_config()

Get the configuration of the embedder.

Returns:

Embedder configuration as a dictionary

Return type:

dict[str, Any]

fit(utterances, labels, descriptions)

Fit the scorer by embedding utterances and descriptions.

Parameters:

utterances (list[str]) – List of utterances to embed
labels (autointent.custom_types.ListOfLabels) – List of labels corresponding to the utterances
descriptions (list[str]) – List of intent descriptions

Raises:

ValueError – If any description is None, or if the embeddings do not match the utterances

Return type:

None

predict(utterances)

Predict scores for utterances based on similarity to intent descriptions.

Parameters:

utterances (list[str]) – List of utterances to score

Returns:

Array of probabilities for each utterance

Return type:

numpy.typing.NDArray[numpy.float64]
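As an illustration of how the returned array might be consumed, the sketch below picks the most likely intent for each utterance. The scores array is hand-written rather than produced by predict, the intent names are made up, and the (n_utterances, n_intents) layout is an assumption that the reference above does not state explicitly.

```python
import numpy as np

# Hypothetical scores for three utterances and two intents.
# Assumption: one row per utterance, one column per intent.
scores = np.array([[0.71, 0.29], [0.40, 0.60], [0.55, 0.45]])
intent_names = ["greeting", "farewell"]  # made-up intent labels

# Multiclass reading: choose the highest-scoring intent for each utterance.
top_idx = scores.argmax(axis=1)
predicted = [intent_names[i] for i in top_idx]

# Score of each chosen intent, e.g. for flagging low-confidence inputs.
confidence = scores[np.arange(len(scores)), top_idx]
```

For multilabel use, thresholding each column independently would be the analogous step; the appropriate threshold depends on the data and is not specified here.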

clear_cache()

Clear cached data in memory used by the embedder.

Return type:

None

get_train_data(context)

Get training data from context.

Parameters:

context (autointent.Context) – Context containing training data

Returns:

Tuple containing utterances, labels, and descriptions

Return type:

tuple[list[str], autointent.custom_types.ListOfLabels, list[str]]

score_cv(context, metrics)

Evaluate the scorer on a test set and compute the specified metrics.

Parameters:

context (autointent.Context) – Context containing test set and other data
metrics (list[str]) – List of metric names to compute

Returns:

Dictionary of computed metric values

Return type:

dict[str, float]