Context guide#
Introduction#
The Context
class is a backbone component of the Chatsky API.
Like the name suggests, this data structure is used to store information
about the current state, or context, of a particular conversation.
Each individual user has their own Context
instance and can be identified by it.
Context
is used to keep track of the user’s requests, bot’s replies,
user-related and request-related annotations, and any other information
that is relevant to the conversation with the user.
Note
Since most callback functions used in Chatsky script and Chatsky pipeline (see the basic guide)
need to either read or update the current dialog state,
the framework-level convention is that all functions of this kind
use Context
as their first parameter. This dependency is being
injected by the pipeline during its run.
Thus, understanding the Context
class is essential for developing custom conversation logic
which is mostly made up by the said functions.
As a callback parameter, Context
provides a convenient interface for working with data,
allowing developers to easily add, retrieve,
and manipulate data as the conversation progresses.
Let’s consider some of the built-in callback instances to see how the context can be leveraged:
1class Regexp(BaseCondition):
2 pattern: str
3
4 @cached_property
5 def re_object(self) -> Pattern:
6 return re.compile(self.pattern)
7
8 async def call(self, ctx: Context) -> bool:
9 request = ctx.last_request
10 if request.text is None:
11 return False
12 return bool(self.re_object.search(request.text))
The code above is a condition function (see the conditions tutorial)
that belongs to the TRANSITIONS
section of the script and returns True or False
depending on whether the current user request matches the given pattern.
As can be seen from the code block, the current
request (last_request
) can be retrieved as one of the attributes of the Context
object.
Likewise, the last_response
(bot’s current reply) or the last_label
(the name of the current node) attributes can be used in the same manner.
Another common use case is leveraging the misc
field (see below for a detailed description):
pipeline functions or PROCESSING
callbacks can write arbitrary values to the misc field,
making those available for other context-dependent functions.
1import urllib.request
2import urllib.error
3
4class PingExample(BaseProcessing):
5 async def call(self, ctx):
6 try:
7 with urllib.request.urlopen("https://example.com/") as webpage:
8 web_content = webpage.read().decode(
9 webpage.headers.get_content_charset()
10 )
11 result = "Example Domain" in web_content
12 except urllib.error.URLError:
13 result = False
14 ctx.misc["can_ping_example_com"] = result
API#
This sections describes the API of the Context
class.
For more information, such as method signatures, see API reference.
Attributes#
id: This attribute represents the unique context identifier. By default, it is randomly generated using uuid4. In most cases, this attribute will be used to identify a user.
labels: The labels attribute stores the history of all passed labels within the conversation. It maps turn IDs to labels. The collection is ordered, so getting the last item of the mapping always shows the last visited node.
Note that labels only stores the nodes that were transitioned to so start_label will not be in this attribute.
requests: The requests attribute maintains the history of all received requests by the agent. It also maps turn IDs to requests. Like labels, it stores the requests in-order.
responses: This attribute keeps a record of all agent responses, mapping turn IDs to responses. Stores the responses in-order.
misc: The misc attribute is a dictionary for storing custom data. This field is not used by any of the built-in Chatsky classes or functions, so the values that you write there are guaranteed to persist throughout the lifetime of the
Context
object.framework_data: This attribute is used for storing custom data required for pipeline execution. It is meant to be used by the framework only. Accessing it may result in pipeline breakage. But there are some methods that provide access to specific fields of framework data. These methods are described in the next section.
Methods#
The methods of the Context
class can be divided into two categories:
Public methods that get called manually in custom callbacks and in functions that depend on the context.
Methods that are not designed for manual calls and get called automatically during pipeline runs, i.e. quasi-private methods. You may still need them when developing extensions or heavily modifying Chatsky.
Public methods#
last_request: Return the last request of the context.
last_response: Return the last response of the context, or None if the
responses
field is empty.Responses are added at the end of each turn, so an empty
response
field is something you should definitely consider.last_label: Return the last node label of the context (i.e. name of the current node).
clear: Clear all items from context fields, optionally keeping the data from hold_last_n_indices turns. You can specify which fields to clear using the field_names parameter. This method is designed for cases when contexts are shared over high latency networks.
current_node: Return the current node of the context. Use this property to access properties of the current node. You can safely modify properties of this. The changes will be reflected in bot behaviour during this turn, bot are not permanent (the node stored inside the script is not changed).
Note
See the preprocessing tutorial.
pipeline: Return
Pipeline
object that is used to process this context. This can be used to getScript
,start_label
orfallback_label
.
Private methods#
These methods should not be used outside of the internal workings.
set_last_response
set_last_request
add_request
add_response
add_label
Context storages#
Since context instances contain all the information, relevant for a particular user, there needs to be a way
to persistently store that information and to make it accessible in different user sessions.
This functionality is implemented by the context storages
module that provides
the uniform DBContextStorage
interface as well as child classes thereof that integrate
various database types (see the
api reference).
The supported storage options are as follows:
DBContextStorage
instances can be uniformly constructed using the context_storage_factory
function.
The function’s only parameter is a connection string that specifies both the database type
and the connection parameters, for example, mongodb://admin:pass@localhost:27016/admin.
(see the reference)
Note
To learn how to use context_storage_factory
in your pipeline, see our Context Storage Tutorials.
The GitHub-based distribution of Chatsky includes Docker images for each of the supported database types. Therefore, the easiest way to deploy your service together with a database is to clone the GitHub distribution and to take advantage of the packaged docker compose file.
1git clone https://github.com/deeppavlov/chatsky.git
2cd chatsky
3# assuming we need to deploy mongodb
4docker compose up mongo
The images can be configured using the docker compose file or the environment file, also available in the distribution. Consult these files for more options.
Warning
The data transmission protocols require the data to be JSON-serializable. Chatsky tackles this problem
through utilization of pydantic
as described in the next section.
Serialization#
The fact that the Context
class is a Pydantic model makes it easily convertible to other data formats,
such as JSON. For instance, as a developer, you don’t need to implement instructions on how datetime fields
need to be marshalled, since this functionality is provided by Pydantic out of the box.
As a result, working with web interfaces and databases that require the transmitted data to be serialized
becomes as easy as calling the model_dump_json method:
serialized_context = context.model_dump_json()
Knowing that, you can easily extend Chatsky to work with storages like Memcache or web APIs of your liking.