Hi! I want to apply static quantization to my model using my own dataset for calibration.
The documentation does not explain how to use get_calibration_dataset with an "external dataset". Since this differs from using a Hugging Face dataset, I don't understand whether I need to convert my dataset to a special format or specify some configuration. I've tried passing the path to my CSV file, and also tried JSON.
Update: it seems to be mandatory to use a dataset from the datasets library, since load_dataset is called inside the function and its arguments seem to target Hugging Face datasets (no data_files argument is available).
I would appreciate any info.
Thanks!
Hi @fxmarty,
The problem is that in the source code of quantizer.get_calibration_dataset(), load_dataset is called like this:
calib_dataset = load_dataset(
    dataset_name,
    name=dataset_config_name,
    split=dataset_split,
    use_auth_token=use_auth_token,
)
As you can see, there is no way to pass data_files, so it's not possible to use what is explained in Load tabular data.
In the example, they use glue, which is easier, but I want to calibrate the quantization on my own dataset because it better matches my requirements and my use case.
Hi @aroger, so if dataset_name="csv" is used and get_calibration_dataset offers a way to pass the data_files argument, that should work. Would you like to open a PR to enable this?
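To make the idea concrete, here is a rough sketch of what forwarding data_files to load_dataset could look like. This is not the actual Optimum implementation: build_load_dataset_kwargs and load_calibration_dataset are hypothetical names, the num_samples and seed defaults are illustrative, and use_auth_token is left out for brevity. The only library calls relied on are datasets.load_dataset, Dataset.shuffle and Dataset.select.

```python
from typing import Any, Dict, Optional, Sequence, Union

# data_files accepts a single path, a list of paths, or a split -> path(s) mapping.
DataFiles = Union[str, Sequence[str], Dict[str, Union[str, Sequence[str]]]]


def build_load_dataset_kwargs(
    dataset_config_name: Optional[str] = None,
    dataset_split: str = "train",
    data_files: Optional[DataFiles] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for datasets.load_dataset (hypothetical helper).

    data_files is only forwarded when it is set, so loading a Hub dataset
    such as "glue" behaves exactly as before.
    """
    kwargs: Dict[str, Any] = {"name": dataset_config_name, "split": dataset_split}
    if data_files is not None:
        kwargs["data_files"] = data_files
    return kwargs


def load_calibration_dataset(
    dataset_name: str,
    num_samples: int = 100,
    seed: int = 2016,
    **options: Any,
):
    """Load a dataset and keep at most num_samples shuffled rows (hypothetical helper)."""
    # Imported lazily so the kwargs helper above works without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset(dataset_name, **build_load_dataset_kwargs(**options))
    num_samples = min(num_samples, len(dataset))
    return dataset.shuffle(seed=seed).select(range(num_samples))
```

With something like this, a local file could be loaded with load_calibration_dataset("csv", data_files="my_calibration.csv"), where the file name is a placeholder for your own CSV. The resulting dataset would still need the same preprocessing (tokenization, column selection) as the glue example before being used for calibration.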