autointent.modules.scoring.BertScorer#

class autointent.modules.scoring.BertScorer(classification_model_config=None, num_train_epochs=3, batch_size=8, learning_rate=5e-05, seed=0, report_to='none', early_stopping_config=None, print_progress=False)#

Bases: autointent.modules.base.BaseScorer

Scoring module for transformer-based classification using BERT models.

This module uses a transformer model (like BERT) to perform intent classification. It supports both multiclass and multilabel classification tasks, with options for early stopping and various training configurations.

Parameters:
  • classification_model_config (autointent.configs.HFModelConfig | str | dict[str, Any] | None) – Config of the transformer model (HFModelConfig, str, or dict)

  • num_train_epochs (int) – Number of training epochs (default: 3)

  • batch_size (int) – Batch size for training (default: 8)

  • learning_rate (float) – Learning rate for training (default: 5e-5)

  • seed (int) – Random seed for reproducibility (default: 0)

  • report_to (autointent._callbacks.REPORTERS_NAMES | Literal['none']) – Reporting tool for training logs (e.g., "wandb", "tensorboard")

  • early_stopping_config (autointent.configs.EarlyStoppingConfig | dict[str, Any] | None) – Configuration for early stopping during training

  • print_progress (bool) – Whether to print training progress (default: False)

Example:#

from autointent.modules import BertScorer

# Initialize scorer with BERT model
scorer = BertScorer(
    classification_model_config="bert-base-uncased",
    num_train_epochs=3,
    batch_size=8,
    learning_rate=5e-5,
    seed=42
)

# Training data
utterances = ["This is great!", "I didn't like it", "Awesome product", "Poor quality"]
labels = [1, 0, 1, 0]

# Fit the model
scorer.fit(utterances, labels)

# Make predictions
test_utterances = ["Good product", "Not worth it"]
probabilities = scorer.predict(test_utterances)
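
Because the module also supports multilabel classification, labels can be multi-hot lists instead of single class indices. A minimal sketch of that label layout follows; the exact format shown is an assumption, so check autointent.custom_types.ListOfLabels for the authoritative definition:

```python
# Hypothetical multilabel setup: each label is a multi-hot list with
# one entry per intent class (assumed format, not verified against
# autointent.custom_types.ListOfLabels).
utterances = [
    "book a flight and reserve a hotel",  # both "flight" and "hotel" intents
    "what is the weather like tomorrow",  # only the "weather" intent
]
labels = [
    [1, 1, 0],  # flight=1, hotel=1, weather=0
    [0, 0, 1],  # flight=0, hotel=0, weather=1
]

# Every label vector must have the same length (the number of classes).
n_classes = len(labels[0])
assert all(len(row) == n_classes for row in labels)
```

These lists would then be passed to scorer.fit(utterances, labels) exactly as in the multiclass example above.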
name = 'bert'#

Name of the module to reference in search space configuration.

supports_multiclass = True#

Whether the module supports multiclass classification

supports_multilabel = True#

Whether the module supports multilabel classification

classification_model_config#
num_train_epochs = 3#
batch_size = 8#
learning_rate = 5e-05#
seed = 0#
report_to = 'none'#
early_stopping_config#
print_progress = False#
classmethod from_context(context, classification_model_config=None, num_train_epochs=3, batch_size=8, learning_rate=5e-05, seed=0, early_stopping_config=None)#

Initialize self from context.

Returns:

Initialized module

Return type:

BertScorer

get_implicit_initialization_params()#

Return the default parameters used in the __init__ method.

Some parameters of the module may be inferred from the context rather than passed to __init__. They still need to be logged for reproducibility when the module is loaded from disk.

Returns:

Dictionary of default params

Return type:

dict[str, Any]

fit(utterances, labels)#

Fit the scoring module to the training data.

Parameters:
  • utterances (list[str]) – List of training utterances.

  • labels (autointent.custom_types.ListOfLabels) – List of training labels.

Return type:

None

predict(utterances)#

Predict scores for a list of utterances.

Parameters:

utterances (list[str]) – List of utterances to score.

Returns:

Array of predicted scores, with one row per utterance and one column per class.

Return type:

numpy.typing.NDArray[Any]
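
Since the returned score matrix is a plain NumPy array, downstream decision logic is straightforward. A sketch of turning scores into labels, using argmax for multiclass and a fixed threshold for multilabel (the 0.5 cutoff is an illustrative choice here, not a library default):

```python
import numpy as np

# Hypothetical output of scorer.predict() for two utterances, three classes.
scores = np.array([
    [0.10, 0.85, 0.05],
    [0.60, 0.55, 0.02],
])

# Multiclass: pick the highest-scoring class per utterance.
multiclass_preds = scores.argmax(axis=1)        # -> array([1, 0])

# Multilabel: keep every class whose score clears the threshold.
multilabel_preds = (scores >= 0.5).astype(int)  # -> [[0, 1, 0], [1, 1, 0]]
```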

clear_cache()#

Clear cache.

Return type:

None