Inference Pipeline#
After you have configured an optimal pipeline with AutoIntent, you probably want to test its power on some new data! There are several options:
- use it right after optimization
- save it to the file system and load it later
Right After#
Here’s the basic example:
[1]:
from autointent import Dataset, Pipeline

# load a sample dataset and search for the optimal pipeline configuration
dataset = Dataset.from_hub("AutoIntent/clinc150_subset")
pipeline = Pipeline.default_optimizer(multilabel=False)
context = pipeline.fit(dataset)

# the fitted pipeline is immediately ready for inference
pipeline.predict(["hello, world!"])
[1]:
array([2])
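The output contains one predicted intent label ID per input utterance.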
There are a couple of caveats.
Save the vector database.
When customizing the pipeline optimization configuration, you need to ensure that the save_db option of VectorIndexConfig is set to True:
[2]:
from autointent.configs import VectorIndexConfig

# save_db=False is NOT compatible with "right-after-optimization" inference
vector_index_config = VectorIndexConfig(save_db=False)
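Conversely, to make right-after-optimization inference work, enable saving before fitting. A minimal sketch, reusing the pipeline object from the first example and the same set_config method demonstrated in the next section:

# persist the vector database so the fitted pipeline can query it at inference time
pipeline.set_config(VectorIndexConfig(save_db=True))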
RAM usage.
You can reduce RAM usage by dumping all modules to the file system. Just set the following options:
[3]:
from autointent.configs import LoggingConfig
logging_config = LoggingConfig(dump_modules=True, clear_ram=True)
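This config object then needs to be attached to the pipeline before fitting. A minimal sketch, assuming the pipeline object from the first example and the set_config method shown in the next section:

# dump fitted modules to disk and release them from memory during optimization
pipeline.set_config(logging_config)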
Load from File System#
First, your auto-configuration run should dump modules to the file system:
[4]:
from pathlib import Path

from autointent import Dataset, Pipeline
from autointent.configs import LoggingConfig, VectorIndexConfig

dataset = Dataset.from_hub("AutoIntent/clinc150_subset")
pipeline = Pipeline.default_optimizer(multilabel=False)

# dump fitted modules to disk, free them from RAM, and persist the vector database
dump_dir = Path("my_dumps")
pipeline.set_config(LoggingConfig(dump_dir=dump_dir, dump_modules=True, clear_ram=True))
pipeline.set_config(VectorIndexConfig(save_db=True))
Second, after optimization has finished, you need to save the auto-configuration results to the file system:
[5]:
context = pipeline.fit(dataset)
context.dump()
This command saves all results to the run’s directory:
[6]:
run_directory = context.logging_config.dirpath
run_directory
[6]:
PosixPath('/tmp/tmpdd2b80j0/4cc3b99c8c123e4eafd085a807f245b67cead082/docs/source/user_guides/runs/dull_duck_12-24-2024_22-06-43')
After that, you can load the pipeline for inference:
[7]:
loaded_pipeline = Pipeline.load(run_directory)
loaded_pipeline.predict(["hello, world!"])
[7]:
array([2])
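The loaded pipeline behaves just like the in-memory one, so you can pass a batch of utterances at once (a sketch; the actual label IDs depend on your optimization run):

# one predicted label is returned per input utterance
utterances = ["hello, world!", "tell me a joke"]
labels = loaded_pipeline.predict(utterances)
print(labels)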
That’s all!#
[8]:
# [you didn't see it]
# clean up the dumps and vector databases created by this tutorial
import shutil

shutil.rmtree(dump_dir)
for file in Path.cwd().glob("vector_db*"):
    shutil.rmtree(file)