Inference Pipeline#
After you have configured an optimal pipeline with AutoIntent, you will probably want to try it on new data. There are two options:
use it right after optimization
save to file system and then load
Right After#
Here is a basic example:
[1]:
from autointent import Dataset, Pipeline

search_space = [
    {
        "node_type": "scoring",
        "target_metric": "scoring_roc_auc",
        "search_space": [
            {
                "module_name": "knn",
                "k": [1],
                "weights": ["uniform"],
                "embedder_config": ["avsolatorio/GIST-small-Embedding-v0"],
            },
        ],
    },
    {
        "node_type": "decision",
        "target_metric": "decision_accuracy",
        "search_space": [
            {"module_name": "threshold", "thresh": [0.5]},
            {"module_name": "argmax"},
        ],
    },
]

dataset = Dataset.from_hub("AutoIntent/clinc150_subset")
pipeline = Pipeline.from_search_space(search_space)
context = pipeline.fit(dataset)
pipeline.predict(["hello, world!"])
/home/runner/work/AutoIntent/AutoIntent/autointent/nodes/_node_optimizer.py:99: ExperimentalWarning: BruteForceSampler is experimental (supported from v3.1.0). The interface can change in the future.
sampler_instance = optuna.samplers.BruteForceSampler(seed=context.seed) # type: ignore[assignment]
[I 2025-03-08 22:28:12,755] A new study created in memory with name: no-name-ea594b1b-909e-44cd-bf8e-739e9ac220c2
"argmax" is NOT designed to handle OOS samples, but your data contains it. So, using this method reduces the power of classification.
/home/runner/.cache/pypoetry/virtualenvs/autointent-FDypUDHQ-py3.10/lib/python3.10/site-packages/sklearn/metrics/_classification.py:1565: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
[1]:
[3]
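The log above warns that "argmax" is not designed to handle out-of-scope (OOS) samples. The following is a minimal sketch of why that is, written in plain Python rather than with AutoIntent itself. The function names, the use of `None` as an OOS marker, and the example scores are all assumptions made for illustration; AutoIntent's own decision modules may behave differently.

```python
from typing import Optional, Sequence


def argmax_decision(scores: Sequence[float]) -> int:
    """Always return the index of the highest score, however low it is."""
    return max(range(len(scores)), key=lambda i: scores[i])


def threshold_decision(scores: Sequence[float], thresh: float = 0.5) -> Optional[int]:
    """Return the best class, or None (treated here as OOS) if its score is below thresh."""
    best = argmax_decision(scores)
    return best if scores[best] >= thresh else None


# Hypothetical per-class scores for two utterances.
in_scope = [0.1, 0.7, 0.2]      # one class is clearly preferred
out_of_scope = [0.3, 0.4, 0.3]  # no class is confident

print(argmax_decision(in_scope), threshold_decision(in_scope))
print(argmax_decision(out_of_scope), threshold_decision(out_of_scope))
```

In this sketch the argmax rule still assigns a class to the low-confidence utterance, while the threshold rule can abstain. That is the behavior the warning refers to when the data contains OOS samples.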
There is one caveat to keep in mind.
RAM usage.
You can reduce RAM usage by saving all modules to the file system. Set the following options:
[2]:
from autointent.configs import LoggingConfig
logging_config = LoggingConfig(dump_modules=True, clear_ram=True)
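To give an intuition for why dumping modules helps with memory, here is a small standalone sketch of the general "dump, then drop the in-memory reference" pattern. It does not use AutoIntent and is not a description of its internals: the candidate names, the placeholder "models", and the scores are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-ins for trained modules: name -> (model object, validation score).
candidates = {
    "knn_k1": ({"k": 1, "weights": "uniform"}, 0.91),
    "knn_k3": ({"k": 3, "weights": "uniform"}, 0.88),
}

dump_dir = Path(tempfile.mkdtemp())
scores = {}

for name in list(candidates):
    # Remove the module from the in-memory dict before writing it to disk,
    # so only its (small) score stays in RAM.
    model, score = candidates.pop(name)
    with open(dump_dir / f"{name}.pkl", "wb") as f:
        pickle.dump(model, f)
    scores[name] = score

# Only the best candidate is loaded back when it is needed for inference.
best_name = max(scores, key=scores.get)
with open(dump_dir / f"{best_name}.pkl", "rb") as f:
    best_model = pickle.load(f)

print(best_name, best_model)
```

The trade-off in a pattern like this is extra disk I/O in exchange for holding at most one module in memory at a time.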
Load from File System#
First, configure the auto-configuration run to dump modules to the file system:
[3]:
from autointent import Dataset, Pipeline
from autointent.configs import LoggingConfig
dataset = Dataset.from_hub("AutoIntent/clinc150_subset")
pipeline = Pipeline.from_search_space(search_space)  # search_space as defined in the first example
pipeline.set_config(LoggingConfig(dump_modules=True, clear_ram=True))
Second, once optimization has finished, save the auto-configuration results to the file system:
[4]:
context = pipeline.fit(dataset)
context.dump()
# or pipeline.dump() to save only the configured pipeline, without the other optimization assets
/home/runner/work/AutoIntent/AutoIntent/autointent/nodes/_node_optimizer.py:99: ExperimentalWarning: BruteForceSampler is experimental (supported from v3.1.0). The interface can change in the future.
sampler_instance = optuna.samplers.BruteForceSampler(seed=context.seed) # type: ignore[assignment]
Attribute _artifact of type <class 'autointent.context.optimization_info._data_models.ScorerArtifact'> cannot be dumped to file system.
Attribute _artifact of type <class 'autointent.context.optimization_info._data_models.DecisionArtifact'> cannot be dumped to file system.
"argmax" is NOT designed to handle OOS samples, but your data contains it. So, using this method reduces the power of classification.
Attribute _artifact of type <class 'autointent.context.optimization_info._data_models.DecisionArtifact'> cannot be dumped to file system.
/home/runner/.cache/pypoetry/virtualenvs/autointent-FDypUDHQ-py3.10/lib/python3.10/site-packages/sklearn/metrics/_classification.py:1565: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
This command saves all results to the run’s directory:
[5]:
run_directory = context.logging_config.dirpath
run_directory
[5]:
PosixPath('/tmp/tmpgmvqaei_/a35046b30e54481ace5bc9d948e1af8ef3d3eb86/docs/source/user_guides/runs/smiling_salamander_03-08-2025_22-28-20')
After that, you can load the pipeline for inference:
[6]:
loaded_pipeline = Pipeline.load(run_directory)
loaded_pipeline.predict(["hello, world!"])
[6]:
[3]
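Both the in-memory pipeline and the loaded one return `[3]` for the same utterance above. A quick way to check this kind of round trip on more inputs is a small helper that compares two `predict` callables. The helper below is a hypothetical sketch, not part of AutoIntent, and it assumes that `predict` returns a flat sequence of labels, as in the outputs shown on this page. It is demonstrated with a dummy predictor so that it runs on its own.

```python
from typing import Callable, List, Sequence


def predictions_match(
    predict_before: Callable[[List[str]], Sequence],
    predict_after: Callable[[List[str]], Sequence],
    utterances: List[str],
) -> bool:
    """Return True if both predictors give the same labels for every utterance."""
    return list(predict_before(utterances)) == list(predict_after(utterances))


# Dummy predictors standing in for a pipeline before and after reloading.
def dummy_predict(utterances: List[str]) -> List[int]:
    return [len(u) % 2 for u in utterances]


def other_predict(utterances: List[str]) -> List[int]:
    return [0 for _ in utterances]


print(predictions_match(dummy_predict, dummy_predict, ["hello, world!", "hi"]))
```

With the objects from this guide, the call would presumably look like `predictions_match(pipeline.predict, loaded_pipeline.predict, ["hello, world!"])`.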