autointent.modules.scoring.DNNCScorer#

class autointent.modules.scoring.DNNCScorer(k=5, cross_encoder_config=None, embedder_config=None)#

Bases: autointent.modules.base.BaseScorer

Scoring module for intent classification using discriminative nearest neighbor classification.

This module uses a Ranker for scoring candidate intents and can optionally train a logistic regression head on top of cross-encoder features.
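The two-stage idea described above (retrieve a few nearest neighbors, then rescore them pairwise) can be illustrated with a minimal, dependency-free sketch. It is not autointent's implementation: `dnnc_scores`, `token_overlap`, and the rule of crediting only the best neighbor's class are illustrative assumptions, and a toy token-overlap function stands in for both the embedder and the cross-encoder.

```python
from collections.abc import Callable


def dnnc_scores(
    query: str,
    train_utterances: list[str],
    train_labels: list[int],
    n_classes: int,
    k: int,
    retrieve_sim: Callable[[str, str], float],
    pair_score: Callable[[str, str], float],
) -> list[float]:
    """Score one query: retrieve k neighbors, rerank them, credit the best one's class."""
    # Stage 1: cheap retrieval of the k most similar training utterances.
    ranked = sorted(
        range(len(train_utterances)),
        key=lambda i: retrieve_sim(query, train_utterances[i]),
        reverse=True,
    )
    candidates = ranked[:k]

    # Stage 2: rescore each (query, candidate) pair with the pairwise scorer.
    rescored = [(pair_score(query, train_utterances[i]), i) for i in candidates]
    best_score, best_idx = max(rescored)

    # Assumption of this sketch: only the best neighbor's class gets a non-zero score.
    scores = [0.0] * n_classes
    scores[train_labels[best_idx]] = best_score
    return scores


def token_overlap(a: str, b: str) -> float:
    """Toy similarity (Jaccard overlap of lowercase tokens) standing in for real models."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


train = ["what is your name", "tell me your name", "how are you", "how are you doing today"]
labels = [0, 0, 1, 1]
result = dnnc_scores("what is your name please", train, labels, 2, 3, token_overlap, token_overlap)
```

In this sketch `k` bounds how many expensive pairwise comparisons are made per query, which is the usual reason for retrieving candidates before rescoring them.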

Parameters:

cross_encoder_config (autointent.configs.CrossEncoderConfig | str | dict[str, Any] | None) – Config of the cross-encoder model
embedder_config (autointent.configs.EmbedderConfig | str | dict[str, Any] | None) – Config of the embedder model
k (pydantic.PositiveInt) – Number of nearest neighbors to retrieve

Examples:#

from autointent.modules.scoring import DNNCScorer

# Toy training data: one utterance per intent label
utterances = ["what is your name?", "how are you?"]
labels = [0, 1]

# Both configs are given here as model names
scorer = DNNCScorer(
    cross_encoder_config="cross-encoder/ms-marco-MiniLM-L6-v2",
    embedder_config="sergeyzh/rubert-tiny-turbo",
    k=5,
)
scorer.fit(utterances, labels)

# Score unseen utterances
test_utterances = ["Hello!", "What's up?"]
scores = scorer.predict(test_utterances)

Reference: Zhang, J. G., Hashimoto, K., Liu, W., Wu, C. S., Wan, Y., Yu, P. S., … & Xiong, C. (2020). Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference. arXiv preprint arXiv:2010.13009.

name = 'dnnc'#

Name of the module to reference in search space configuration.

supports_multilabel = False#

Whether the module supports multilabel classification.

supports_multiclass = True#

Whether the module supports multiclass classification.
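As a rough illustration of how the `name` attribute might be referenced, the sketch below builds a search-space entry as a plain Python dictionary. Only the value `'dnnc'` and the parameter `k` come from this page; the surrounding keys (`node_type`, `search_space`, `module_name`) and the list-of-candidates layout are assumptions about the configuration format and should be checked against the search space documentation.

```python
# Hypothetical search-space entry; the key names are assumptions, not confirmed by this page.
scoring_node = {
    "node_type": "scoring",
    "search_space": [
        {
            "module_name": "dnnc",  # matches DNNCScorer.name
            "k": [1, 3, 5],  # candidate values for the number of neighbors
        }
    ],
}
```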

cross_encoder_config#

embedder_config#

k = 5#

classmethod from_context(context, k=5, cross_encoder_config=None, embedder_config=None)#

Create a DNNCScorer instance using a Context object.

Parameters:

context (autointent.Context) – Context containing configurations and utilities
cross_encoder_config (autointent.configs.CrossEncoderConfig | str | None) – Config of the cross-encoder model
k (pydantic.PositiveInt) – Number of nearest neighbors to retrieve
embedder_config (autointent.configs.EmbedderConfig | str | None) – Config of the embedder model, or None to use the best embedder

Return type:

DNNCScorer

get_implicit_initialization_params()#

Return default params used in __init__ method.

Some parameters of the module may be inferred from the context rather than passed to the __init__ method. They still need to be logged so that the module can be reproduced when loaded from disk.

Returns: Dictionary of default params
Return type: dict[str, Any]

fit(utterances, labels)#

Fit the scorer by training or loading the vector index.

Parameters:

utterances (list[str]) – List of training utterances
labels (autointent.custom_types.ListOfLabels) – List of labels corresponding to the utterances

Raises:

ValueError – If the vector index mismatches the provided utterances

Return type:

None

predict(utterances)#

Predict class scores for the given utterances.

Parameters: utterances (list[str]) – List of utterances to score
Returns: Array of predicted scores
Return type: numpy.typing.NDArray[Any]
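To turn scores into predicted labels, one common approach is to take the highest-scoring class for each utterance. The sketch below assumes the score array has one row per utterance and one column per class, which this page does not state explicitly; `top_labels` and `example_scores` are hypothetical names, and plain lists are used so the snippet runs without the library.

```python
def top_labels(scores: list[list[float]]) -> list[int]:
    """Return the index of the highest score in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Hypothetical scores for two utterances over two classes.
example_scores = [[0.1, 0.7], [0.9, 0.0]]
predicted = top_labels(example_scores)
```

With a NumPy array, `scores.argmax(axis=1)` gives the same result under the same shape assumption.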

predict_with_metadata(utterances)#

Predict class scores along with metadata for the given utterances.

Parameters:

utterances (list[str]) – List of utterances to score

Returns:

Tuple containing:

Array of predicted scores
List of metadata with neighbor details and scores

Return type:

tuple

clear_cache()#

Clear cached data in memory used by the vector index.

Return type: None