Inference Pipeline#

After you have configured an optimal pipeline with AutoIntent, you probably want to test its power on some new data! There are several options:

  • use it right after optimization

  • save it to the file system and load it later

Right After#

Here’s the basic example:

[1]:
from autointent import Dataset, Pipeline

dataset = Dataset.from_hub("AutoIntent/clinc150_subset")
pipeline = Pipeline.default_optimizer(multilabel=False)
context = pipeline.fit(dataset)
pipeline.predict(["hello, world!"])
[1]:
array([2])
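
Note that predict accepts a whole batch of utterances and returns one predicted intent id per utterance. A minimal sketch (the second utterance is an arbitrary example, not necessarily from the dataset):

[ ]:
# batch inference: one predicted label per input utterance
pipeline.predict(["hello, world!", "could you set an alarm for 8 am?"])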

There are several caveats.

  1. Save the vector database.

When customizing the pipeline optimization configuration, you need to ensure that the save_db option of VectorIndexConfig is set to True. The following, for example, would break right-after-optimization inference:

[2]:
from autointent.configs import VectorIndexConfig

# isn't compatible with "right-after-optimization" inference
vector_index_config = VectorIndexConfig(save_db=False)
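
Conversely, keeping the vector database saved on disk is what makes right-after-optimization inference work. A minimal sketch, using the same set_config method that appears later in this guide:

[ ]:
# compatible with "right-after-optimization" inference
vector_index_config = VectorIndexConfig(save_db=True)
pipeline.set_config(vector_index_config)
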
  2. RAM usage.

You can reduce RAM usage by dumping all modules to the file system. Just set the following options:

[3]:
from autointent.configs import LoggingConfig

logging_config = LoggingConfig(dump_modules=True, clear_ram=True)
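
These settings take effect once the config is attached to the pipeline before fitting, via the same set_config method used in the next section:

[ ]:
# attach the logging config before calling pipeline.fit(...)
pipeline.set_config(logging_config)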

Load from File System#

Firstly, your auto-configuration run should dump modules to the file system:

[4]:
from pathlib import Path

from autointent import Dataset, Pipeline
from autointent.configs import LoggingConfig, VectorIndexConfig

dataset = Dataset.from_hub("AutoIntent/clinc150_subset")
pipeline = Pipeline.default_optimizer(multilabel=False)
dump_dir = Path("my_dumps")
pipeline.set_config(LoggingConfig(dump_dir=dump_dir, dump_modules=True, clear_ram=True))
pipeline.set_config(VectorIndexConfig(save_db=True))

Secondly, after the optimization has finished, you need to save the auto-configuration results to the file system:

[5]:
context = pipeline.fit(dataset)
context.dump()

This command saves all results to the run’s directory:

[6]:
run_directory = context.logging_config.dirpath
run_directory
[6]:
PosixPath('/tmp/tmpdd2b80j0/4cc3b99c8c123e4eafd085a807f245b67cead082/docs/source/user_guides/runs/dull_duck_12-24-2024_22-06-43')

After that, you can load the pipeline for inference:

[7]:
loaded_pipeline = Pipeline.load(run_directory)
loaded_pipeline.predict(["hello, world!"])
[7]:
array([2])
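
Since the run directory name is generated automatically, you may want to persist its path if the pipeline will be loaded from a separate process. A minimal sketch, assuming a plain text file suits your setup:

[ ]:
# store the run directory path for later sessions (illustrative only)
Path("run_dir.txt").write_text(str(run_directory))

# ...later, possibly in a fresh process:
saved_path = Path(Path("run_dir.txt").read_text())
loaded_pipeline = Pipeline.load(saved_path)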

That’s all!#

[8]:
# clean up the artifacts created by this tutorial
import shutil

shutil.rmtree(dump_dir)

# remove vector databases saved in the current working directory
for file in Path.cwd().glob("vector_db*"):
    shutil.rmtree(file)