autointent.modules.scoring.DNNCScorer
- class autointent.modules.scoring.DNNCScorer(k, cross_encoder_config=None, embedder_config=None)
Bases: autointent.modules.base.BaseScorer
Scoring module for intent classification using discriminative nearest neighbor classification.
This module uses a Ranker for scoring candidate intents and can optionally train a logistic regression head on top of cross-encoder features.
- Parameters:
cross_encoder_config (autointent.configs.CrossEncoderConfig | str | dict[str, Any] | None) – Config of the cross-encoder model
embedder_config (autointent.configs.EmbedderConfig | str | dict[str, Any] | None) – Config of the embedder model
k (pydantic.PositiveInt) – Number of nearest neighbors to retrieve
Examples:
    from autointent.modules.scoring import DNNCScorer

    utterances = ["what is your name?", "how are you?"]
    labels = [0, 1]
    scorer = DNNCScorer(
        cross_encoder_config="cross-encoder/ms-marco-MiniLM-L-6-v2",
        embedder_config="sergeyzh/rubert-tiny-turbo",
        k=5,
    )
    scorer.fit(utterances, labels)
    test_utterances = ["Hello!", "What's up?"]
    scores = scorer.predict(test_utterances)
- Reference:
Zhang, J. G., Hashimoto, K., Liu, W., Wu, C. S., Wan, Y., Yu, P. S., … & Xiong, C. (2020). Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference. arXiv preprint arXiv:2010.13009.
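The two-stage idea behind the scorer (retrieve the k nearest training utterances, then rescore each query/neighbor pair with a pairwise model) can be illustrated with a minimal, dependency-free sketch. This is not autointent's implementation: `embed`, `cosine`, `pair_score`, and `dnnc_scores` are hypothetical toy stand-ins for the embedder, the similarity search, the cross-encoder, and the scoring step, and taking the best pair score per class is only one plausible way to aggregate.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def pair_score(query: str, candidate: str) -> float:
    """Toy stand-in for a cross-encoder: Jaccard overlap of the token sets."""
    q, c = set(query.lower().split()), set(candidate.lower().split())
    return len(q & c) / len(q | c) if q | c else 0.0


def dnnc_scores(query, utterances, labels, n_classes, k):
    # Stage 1: retrieve the k training utterances closest to the query.
    q_vec = embed(query)
    ranked = sorted(
        range(len(utterances)),
        key=lambda i: cosine(q_vec, embed(utterances[i])),
        reverse=True,
    )
    neighbors = ranked[:k]
    # Stage 2: rescore each (query, neighbor) pair and keep the best score per class.
    scores = [0.0] * n_classes
    for i in neighbors:
        scores[labels[i]] = max(scores[labels[i]], pair_score(query, utterances[i]))
    return scores


train = ["what is your name", "tell me your name", "how are you", "how are you doing today"]
train_labels = [0, 0, 1, 1]
scores = dnnc_scores("how are you today", train, train_labels, n_classes=2, k=3)
predicted = max(range(len(scores)), key=scores.__getitem__)
```

In this toy setup, a larger k gives the pairwise stage more candidates to compare at the cost of more pair evaluations, which is the trade-off the `k` parameter controls.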
- name = 'dnnc'
Name of the module.
- supports_multilabel = False
Whether the module supports multilabel classification.
- supports_multiclass = True
Whether the module supports multiclass classification.
- cross_encoder_config
- embedder_config
- k
- classmethod from_context(context, k, cross_encoder_config=None, embedder_config=None)
Create a DNNCScorer instance using a Context object.
- Parameters:
context (autointent.Context) – Context containing configurations and utilities
cross_encoder_config (autointent.configs.CrossEncoderConfig | str | None) – Config of the cross-encoder model
k (pydantic.PositiveInt) – Number of nearest neighbors to retrieve
embedder_config (autointent.configs.EmbedderConfig | str | None) – Config of the embedder model, or None to use the best embedder
- Return type:
- fit(utterances, labels)
Fit the scorer by training or loading the vector index.
- Parameters:
utterances – Training utterances
labels – Intent labels, one per training utterance
- Raises:
ValueError – If the vector index mismatches the provided utterances
- Return type:
None
- predict(utterances)
Predict class scores for the given utterances.
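As an illustration of how class scores might be consumed downstream, the sketch below picks the highest-scoring class for each utterance. It assumes the scores form one row per utterance and one column per class; that layout is an assumption here, not something this page states, and `top_labels` and `example_scores` are hypothetical names with made-up values.

```python
def top_labels(scores):
    """Return (class index, score) of the highest-scoring class for each row."""
    results = []
    for row in scores:
        best = max(range(len(row)), key=row.__getitem__)
        results.append((best, row[best]))
    return results


# Hypothetical scores for two utterances over two intent classes.
example_scores = [[0.91, 0.09], [0.20, 0.80]]
decisions = top_labels(example_scores)
```

Keeping the winning score next to the label makes it easy to apply a confidence threshold later, for example to route low-scoring utterances to a fallback.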
- predict_with_metadata(utterances)
Predict class scores along with metadata for the given utterances.
- clear_cache()
Clear cached data in memory used by the vector index.
- Return type:
None