autointent.Dataset
- class autointent.Dataset(*args, intents, **kwargs)
Bases: dict[str, datasets.Dataset]
Represents a dataset with associated metadata and utilities for processing.
This class extends a dictionary where the keys represent dataset splits (e.g., ‘train’, ‘test’), and the values are Hugging Face datasets.
- Parameters:
args (Any)
intents (list[autointent.schemas.Intent])
kwargs (Any)
- intents: list[autointent.schemas.Intent]
All metadata about intents used in this dataset.
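The class signature places intents after *args, which makes it a keyword-only argument in Python. A minimal sketch of that pattern follows; the SplitDict class, the sample fields, and the use of plain lists in place of Hugging Face datasets are assumptions for illustration and are not part of autointent.

```python
from typing import Any, Dict, List


class SplitDict(dict):
    """Hypothetical stand-in: a dict of splits that also carries intent metadata."""

    def __init__(self, *args: Any, intents: List[Dict[str, Any]], **kwargs: Any) -> None:
        # Positional and extra keyword arguments are forwarded to dict(),
        # so they become the split-name -> samples mapping.
        super().__init__(*args, **kwargs)
        # `intents` follows *args in the signature, so it must be passed by keyword.
        self.intents = intents


splits = SplitDict(
    {"train": [{"utterance": "hi", "label": 0}]},
    intents=[{"id": 0, "name": "greeting"}],
    test=[{"utterance": "hello", "label": 0}],
)

print(sorted(splits))    # split names
print(splits.intents)    # metadata stored alongside the splits
```

Omitting intents in this sketch raises a TypeError, which mirrors how a required keyword-only parameter behaves.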
- classmethod from_dict(mapping)
Creates a dataset from a dictionary mapping.
- classmethod from_json(filepath)
Loads a dataset from a JSON file.
- Parameters:
filepath (str | pathlib.Path) – Path to the JSON file.
- Return type:
- classmethod from_hub(repo_name)
Loads a dataset from the Hugging Face Hub.
- to_dict()
Converts the dataset into a dictionary format.
Returns a dictionary where the keys are dataset splits and the values are lists of samples.
- to_json(filepath)
Saves the dataset to a JSON file.
- Parameters:
filepath (str | pathlib.Path) – The file path where the dataset should be saved.
- Return type:
None
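A JSON round trip in the spirit of to_json and from_json can be sketched with the standard library alone. The helper names save_json and load_json, the sample fields, and the file layout (split names mapped to lists of samples) are assumptions for illustration; the on-disk format autointent actually uses may differ.

```python
import json
import tempfile
from pathlib import Path
from typing import Union

# Hypothetical payload: split names mapped to lists of samples.
payload = {
    "train": [{"utterance": "book a table", "label": 0}],
    "test": [{"utterance": "cancel my order", "label": 1}],
}


def save_json(data: dict, filepath: Union[str, Path]) -> None:
    """Write the mapping to disk as UTF-8 JSON."""
    text = json.dumps(data, indent=2, ensure_ascii=False)
    Path(filepath).write_text(text, encoding="utf-8")


def load_json(filepath: Union[str, Path]) -> dict:
    """Read the mapping back from disk."""
    return json.loads(Path(filepath).read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "dataset.json"
    save_json(payload, path)          # accepts a pathlib.Path ...
    restored = load_json(str(path))   # ... or a plain string path

print(restored == payload)
```

Both helpers accept either a str or a pathlib.Path, matching the filepath types documented above.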
- push_to_hub(repo_name, private=False)
Uploads the dataset to the Hugging Face Hub.
- get_tags()
Extracts unique tags from the dataset’s intents.
- Return type:
- get_n_classes(split)
Calculates the number of unique classes in a dataset split.
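As an illustration of what counting unique classes in a split could look like, the sketch below works on a plain list of samples. The count_classes helper and the label field are hypothetical, and the sketch assumes single-label integer classes; it is not the library's implementation.

```python
from typing import Any, Dict, List


def count_classes(samples: List[Dict[str, Any]]) -> int:
    """Count distinct labels, skipping samples that carry no label."""
    labels = {s["label"] for s in samples if s.get("label") is not None}
    return len(labels)


# Hypothetical split contents; the last sample has no label.
train = [
    {"utterance": "book a table", "label": 0},
    {"utterance": "reserve a seat", "label": 0},
    {"utterance": "cancel my order", "label": 1},
    {"utterance": "what is the weather", "label": None},
]

n_classes = count_classes(train)
print(n_classes)
```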