Context storage benchmarking#

This module contains functions for context storages benchmarking.

The core function is time_context_read_write(), but it has a low-level interface.

Higher-level wrappers of this function provided by this module are BenchmarkCase, save_results_to_file() and benchmark_all().

The wrappers use the BenchmarkConfig interface to configure benchmarks. A simple configuration class as well as a configuration set are provided by chatsky.utils.db_benchmark.basic_config.

To view files generated by save_results_to_file(), use either report() or our streamlit app.

time_context_read_write(context_storage, context_factory, context_num, context_updater=None)[source]#

Benchmark context_storage by writing contexts generated by context_factory into it and reading them back context_num times. If context_updater is not None, it is used to update contexts and benchmark the update operation.

This function clears context_storage before and after execution.

Parameters:
  • context_storage (DBContextStorage) – Context storage to benchmark.

  • context_factory (Callable[[], Context]) – A function that creates contexts which will be written into context storage.

  • context_num (int) – The number of times a context will be written and read.

  • context_updater (Optional[Callable[[Context], Optional[Context]]]) –

    None or a function. If not None, the function should accept a Context and return an updated Context. The updated context can be either the same object (at the same pointer) or a different object (e.g. a copy). The updated context should have a higher dialog length than the received context (to emulate context updating during a dialog). The function should return None to stop updating contexts. For an example of such a function, see the implementation of chatsky.utils.db_benchmark.basic_config.BasicBenchmarkConfig.context_updater().

    To avoid keeping many contexts in memory, this function will be called repeatedly at least context_num times.

Return type:

Tuple[List[float], List[Dict[int, float]], List[Dict[int, float]]]

Returns:

A tuple of 3 elements.

The first element – a list of write times. Its length is equal to context_num.

The second element – a list of dictionaries with read times. Each dictionary maps from int to float: the keys are the dialog_len values of the contexts and the values are the read times for the corresponding dialog_len. If context_updater is None, all dictionaries will have only one key – the dialog length of the context returned by context_factory. Otherwise, the dictionaries will also have a key for each updated context.

The third element – a list of dictionaries with update times. Structurally the same as the second element, but none of the dictionaries here have a key for the dialog_len of the context returned by context_factory. So if context_updater is None, all dictionaries will be empty.
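For illustration, a minimal sketch of a direct call. It assumes BasicBenchmarkConfig from chatsky.utils.db_benchmark.basic_config can be instantiated with its defaults and that the context storage is created by a factory such as chatsky.context_storages.context_storage_factory; both are assumptions, not part of the signature above:

    from chatsky.context_storages import context_storage_factory  # assumed factory location
    from chatsky.utils.db_benchmark import time_context_read_write
    from chatsky.utils.db_benchmark.basic_config import BasicBenchmarkConfig

    # Assumed: BasicBenchmarkConfig provides default implementations of
    # get_context() and context_updater(), as described further on this page.
    config = BasicBenchmarkConfig()
    storage = context_storage_factory("json://dbs/benchmark.json")  # any supported URI

    write_times, read_times, update_times = time_context_read_write(
        context_storage=storage,
        context_factory=config.get_context,
        context_num=config.context_num,
        context_updater=config.context_updater,
    )
    # write_times: List[float]
    # read_times / update_times: List[Dict[int, float]], keyed by dialog_len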

class DBFactory(**data)[source]#

Bases: BaseModel

A class for storing information about a context storage to benchmark. It is also used to create the context storage from this configuration.

uri: str#

URI of the context storage.

factory_module: str#

Name of the module that contains the factory.

factory: str#

Name of the context storage factory (a function that creates context storages from URIs).

db()[source]#

Create a context storage by importing factory from factory_module and calling it with uri.
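A minimal sketch of building a storage through DBFactory (the factory_module and factory values shown are assumptions; use whichever factory function your installation provides):

    from chatsky.utils.db_benchmark import DBFactory  # assumed export location

    db_factory = DBFactory(
        uri="sqlite+aiosqlite:///dbs/benchmark.sqlite",
        factory_module="chatsky.context_storages",   # assumed module containing the factory
        factory="context_storage_factory",           # assumed name of the factory function
    )
    storage = db_factory.db()  # imports the factory and applies it to `uri`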

class BenchmarkConfig(**data)[source]#

Bases: BaseModel, ABC

Configuration for a benchmark.

Defines methods and parameters required to run time_context_read_write(). Also defines a method (info) for displaying information about this configuration.

A simple way to configure benchmarks is provided by BasicBenchmarkConfig.

Inherit from this class only if BasicBenchmarkConfig is not enough for your benchmarking needs.

context_num: int#

Number of times the contexts will be benchmarked. Increasing this number decreases standard error of the mean for benchmarked data.

abstract get_context()[source]#

Return a context to benchmark read and write operations with.

This function will be called context_num times.

Return type:

Context

abstract info()[source]#

Return a dictionary with information about this configuration.

Return type:

Dict[str, Any]

abstract context_updater(context)[source]#

Update context with new dialog turns or return None to stop updates.

This function is used to benchmark update and read operations.

This function will be called AT LEAST context_num times.

Return type:

Optional[Context]

Returns:

Updated context or None to stop updating context.
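As a sketch only (the Context import path and whether a default-constructed Context is valid are assumptions), a custom subclass implementing the three abstract members might look like this:

    from typing import Any, Dict, Optional

    from chatsky.core import Context  # assumed import path for Context
    from chatsky.utils.db_benchmark import BenchmarkConfig  # assumed export location


    class TinyBenchmarkConfig(BenchmarkConfig):
        """Hypothetical config: small contexts, no update benchmarking."""

        context_num: int = 10

        def get_context(self) -> Context:
            # Assumed: a default-constructed Context is acceptable for benchmarking.
            return Context()

        def info(self) -> Dict[str, Any]:
            return {"name": "tiny", "context_num": self.context_num}

        def context_updater(self, context: Context) -> Optional[Context]:
            # Returning None immediately disables the update benchmark.
            return None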

class BenchmarkCase(**data)[source]#

Bases: BaseModel

This class represents a benchmark case and includes information about it, its configuration, and the configuration of the context storage to benchmark.

name: str#

Name of a benchmark case.

db_factory: DBFactory#

DBFactory that specifies the context storage to benchmark.

benchmark_config: BenchmarkConfig#

Benchmark configuration.

uuid: str#

Unique id of the case. Defaults to a random uuid.

description: str#

Description of the case. Defaults to an empty string.

static set_average_results(benchmark)[source]#

Modify the benchmark dictionary to include averaged benchmark results.

Add an “average_results” field to the benchmark that contains the following fields:

  • average_write_time

  • average_read_time

  • average_update_time

  • read_times_grouped_by_context_num – a list of read times. Each element is the average of read times with the same context_num.

  • read_times_grouped_by_dialog_len – a dictionary of read times. Its values are the averages of read times with the same dialog_len, its keys are dialog_len values.

  • update_times_grouped_by_context_num

  • update_times_grouped_by_dialog_len

  • pretty_write – average write time with only 3 significant digits.

  • pretty_read

  • pretty_update

  • pretty_read+update – sum of average read and update times with only 3 significant digits.

Parameters:

benchmark – A dictionary returned by BenchmarkCase._run(). It should include “success” and “result” fields. The “success” field should be true. The “result” field should be a dictionary with keys “write_times”, “read_times” and “update_times”, whose values are those returned by time_context_read_write().

Returns:

None
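A hypothetical input illustrating the expected shape (timing values are made up; the import location is an assumption):

    from chatsky.utils.db_benchmark import BenchmarkCase  # assumed export location

    benchmark = {
        "success": True,
        "result": {
            "write_times": [0.0012, 0.0011],
            "read_times": [{1: 0.0005}, {1: 0.0006}],
            "update_times": [{}, {}],
        },
    }
    BenchmarkCase.set_average_results(benchmark)
    # benchmark now also contains an "average_results" dictionary with the
    # fields listed above, e.g. benchmark["average_results"]["pretty_write"].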

_run()[source]#
run()[source]#

Run benchmark, return results.

Returns:

A dictionary with 3 keys: “success”, “result”, “average_results”.

Success is a bool value. It is false if an exception was raised during benchmarking.

Result is either an exception message or a dictionary with 3 keys (“write_times”, “read_times”, “update_times”). Values of those fields are the values returned by time_context_read_write().

Average results field is as described in set_average_results().
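A sketch of running a single case (default values for DBFactory's factory fields and for BasicBenchmarkConfig are assumed):

    from chatsky.utils.db_benchmark import BenchmarkCase, DBFactory
    from chatsky.utils.db_benchmark.basic_config import BasicBenchmarkConfig

    case = BenchmarkCase(
        name="sqlite-basic",
        db_factory=DBFactory(uri="sqlite+aiosqlite:///dbs/benchmark.sqlite"),
        benchmark_config=BasicBenchmarkConfig(),
    )
    result = case.run()
    if result["success"]:
        print(result["average_results"]["pretty_write"])
    else:
        print("Benchmark failed:", result["result"])  # exception message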

save_results_to_file(benchmark_cases, file, name, description, exist_ok=False)[source]#

Benchmark all benchmark_cases and save results to a file.

Results are saved in JSON format with this schema: utils/db_benchmark/benchmark_schema.json.

Files created by this function can be viewed either with report() or with the streamlit app located in the utils directory: utils/db_benchmark/benchmark_streamlit.py.

Parameters:
  • benchmark_cases (List[BenchmarkCase]) – A list of benchmark cases that specify benchmarks.

  • file (Union[str, Path]) – File to save results to.

  • name (str) – Name of the benchmark set.

  • description (str) – Description of the benchmark set.

  • exist_ok (bool) – Whether to continue if the file already exists.
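A usage sketch (case construction mirrors the BenchmarkCase example above; the defaults relied upon are assumptions):

    from pathlib import Path

    from chatsky.utils.db_benchmark import (
        BenchmarkCase,
        DBFactory,
        save_results_to_file,
    )
    from chatsky.utils.db_benchmark.basic_config import BasicBenchmarkConfig

    cases = [
        BenchmarkCase(
            name="sqlite-basic",
            db_factory=DBFactory(uri="sqlite+aiosqlite:///dbs/benchmark.sqlite"),
            benchmark_config=BasicBenchmarkConfig(),
        ),
    ]

    save_results_to_file(
        benchmark_cases=cases,
        file=Path("benchmarks/sqlite.json"),
        name="SQLite benchmarks",
        description="Read/write/update timings for the SQLite context storage.",
        exist_ok=True,  # do not fail if the file already exists
    )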

benchmark_all(file, name, description, db_uri, benchmark_configs, exist_ok=False)[source]#

A wrapper for save_results_to_file().

Generates benchmark_cases from db_uri and benchmark_configs: db_uri is used to initialize a DBFactory instance, which is then used along with benchmark_configs to initialize BenchmarkCase instances.

Parameters:
  • file (Union[str, Path]) – File to save results to.

  • name (str) – Name of the benchmark set.

  • description (str) – Description of the benchmark set. The same description is used for benchmark cases.

  • db_uri (str) – URI of the database to benchmark.

  • benchmark_configs (Dict[str, BenchmarkConfig]) – Mapping from case names to configs.

  • exist_ok (bool) – Whether to continue if the file already exists.
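A usage sketch (the name of the provided configuration set, basic_configurations, is an assumption; substitute any Dict[str, BenchmarkConfig]):

    from chatsky.utils.db_benchmark import benchmark_all
    from chatsky.utils.db_benchmark.basic_config import (
        basic_configurations,  # assumed name of the provided configuration set
    )

    benchmark_all(
        file="benchmarks/postgres.json",
        name="PostgreSQL benchmarks",
        description="All basic configurations against a PostgreSQL storage.",
        db_uri="postgresql+asyncpg://user:password@localhost:5432/test",
        benchmark_configs=basic_configurations,  # Dict[str, BenchmarkConfig]
        exist_ok=False,
    )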