Run reporting

This script demonstrates how to report the optimization process using the AutoIntent library.

[1]:
search_space = [
    {
        "node_type": "embedding",
        "target_metric": "retrieval_hit_rate",
        "search_space": [
            {
                "module_name": "retrieval",
                "k": [10],
                "embedder_config": ["avsolatorio/GIST-small-Embedding-v0", "sergeyzh/rubert-tiny-turbo"],
            }
        ],
    },
    {
        "node_type": "scoring",
        "target_metric": "scoring_roc_auc",
        "search_space": [
            {"module_name": "knn", "k": [1, 3, 5, 10], "weights": ["uniform", "distance", "closest"]},
            {"module_name": "linear"},
            {
                "module_name": "dnnc",
                "cross_encoder_config": ["cross-encoder/ms-marco-MiniLM-L6-v2"],
                "k": [1, 3, 5, 10],
            },
        ],
    },
    {
        "node_type": "decision",
        "target_metric": "decision_accuracy",
        "search_space": [{"module_name": "threshold", "thresh": [0.5]}, {"module_name": "argmax"}],
    },
]
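
To get a feel for how large this search space is, the sketch below counts the distinct hyperparameter combinations each node could try. It assumes every list-valued field is a set of alternatives and that combinations are enumerated exhaustively; the tutorial does not say whether AutoIntent tries all of them or samples a subset, so treat the numbers as an upper bound. The helper names are hypothetical, and `target_metric` is omitted from the copy of the search space because it does not affect the count.

```python
from math import prod

# Compact copy of the search space above (target_metric omitted).
search_space = [
    {
        "node_type": "embedding",
        "search_space": [
            {
                "module_name": "retrieval",
                "k": [10],
                "embedder_config": ["avsolatorio/GIST-small-Embedding-v0", "sergeyzh/rubert-tiny-turbo"],
            }
        ],
    },
    {
        "node_type": "scoring",
        "search_space": [
            {"module_name": "knn", "k": [1, 3, 5, 10], "weights": ["uniform", "distance", "closest"]},
            {"module_name": "linear"},
            {
                "module_name": "dnnc",
                "cross_encoder_config": ["cross-encoder/ms-marco-MiniLM-L6-v2"],
                "k": [1, 3, 5, 10],
            },
        ],
    },
    {
        "node_type": "decision",
        "search_space": [{"module_name": "threshold", "thresh": [0.5]}, {"module_name": "argmax"}],
    },
]


def count_module_candidates(module_space: dict) -> int:
    """Hypothetical helper: number of hyperparameter combinations for one module."""
    # Every key except "module_name" maps to a list of candidate values.
    # prod() of an empty sequence is 1, so a module without options counts once.
    return prod(len(values) for key, values in module_space.items() if key != "module_name")


def count_node_candidates(node: dict) -> int:
    """Hypothetical helper: total combinations across all modules of a node."""
    return sum(count_module_candidates(module) for module in node["search_space"])


candidates_per_node = {node["node_type"]: count_node_candidates(node) for node in search_space}
print(candidates_per_node)
# {'embedding': 2, 'scoring': 17, 'decision': 2}
```

The scoring node dominates: `knn` contributes 4 × 3 = 12 combinations, `dnnc` contributes 4, and `linear` contributes 1.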

Load Data

Let us use a small subset of the popular clinc150 dataset:

[2]:

from autointent import Dataset

dataset = Dataset.from_hub("DeepPavlov/clinc150_subset")

Start Auto Configuration

[3]:
from autointent import Pipeline

pipeline_optimizer = Pipeline.from_search_space(search_space)

Reporting

Currently supported reporting options are:

  • tensorboard

  • wandb
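
Since `report_to` takes a list, a small guard can catch a misspelled backend name before a long optimization run starts. The helper below is a hypothetical sketch and not part of AutoIntent: it only knows the two options listed in this tutorial, and the library itself may accept others or validate the value on its own.

```python
# Reporting backends listed in this tutorial; the library may accept others.
SUPPORTED_REPORTERS = {"tensorboard", "wandb"}


def check_report_to(report_to: list[str]) -> list[str]:
    """Hypothetical helper: reject backend names not listed in this tutorial."""
    unknown = sorted(set(report_to) - SUPPORTED_REPORTERS)
    if unknown:
        raise ValueError(f"Unsupported reporting option(s): {unknown}")
    return report_to


# Passes the list through unchanged when every name is known.
reporters = check_report_to(["tensorboard", "wandb"])
```

The returned list can then be passed as the `report_to` argument of `LoggingConfig`.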

[4]:
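from pathlib import Path

from autointent.configs import LoggingConfig

log_config = LoggingConfig(
    run_name="test_tensorboard",  # name of this run
    report_to=["tensorboard"],  # reporting backends to enable
    project_dir=Path("my_projects"),  # project directory
    dump_modules=False,  # set to True to enable resuming (see the log below)
)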

pipeline_optimizer.set_config(log_config)

[5]:
pipeline_optimizer.fit(dataset)
Memory storage is not compatible with resuming optimization. Modules from previous runs won't be available. Set dump_modules=True in LoggingConfig to enable proper resuming.
Storage directory must be provided for study persistence.
[I 2026-05-20 09:48:52,678] A new study created in memory with name: NodeType.embedding
"argmax" is NOT designed to handle OOS samples, but your data contains it. So, using this method reduces the power of classification.
[5]:
<autointent.context._context.Context at 0x7fc37ca27fe0>

The results of the optimization process can now be viewed in TensorBoard:

tensorboard --logdir test_tensorboard