My Hugging Face dataset repo is zhuchi76/Boat_dataset.
I want to use my custom image dataset to train an object detection model.
I first tested loading the dataset:
from datasets import load_dataset
dataset = load_dataset("zhuchi76/Boat_dataset", trust_remote_code=True)
Then I get the error:
root@arg10:~/dataset/Boat_dataset/Boat_dataset_hf/Boat_dataset# python3 test_load_dataset.py
Downloading builder script: 100%|██████████| 7.93k/7.93k [00:00<00:00, 36.0MB/s]
Traceback (most recent call last):
  File "test_load_dataset.py", line 4, in <module>
    dataset = load_dataset("zhuchi76/Boat_dataset", trust_remote_code=True)
  File "/usr/local/lib/python3.8/dist-packages/datasets/load.py", line 2556, in load_dataset
    builder_instance = load_dataset_builder(
  File "/usr/local/lib/python3.8/dist-packages/datasets/load.py", line 2265, in load_dataset_builder
    builder_instance: DatasetBuilder = builder_cls(
  File "/usr/local/lib/python3.8/dist-packages/datasets/builder.py", line 381, in __init__
    info = self.get_exported_dataset_info()
  File "/usr/local/lib/python3.8/dist-packages/datasets/builder.py", line 557, in get_exported_dataset_info
    return self.get_all_exported_dataset_infos().get(self.config.name, DatasetInfo())
  File "/usr/local/lib/python3.8/dist-packages/datasets/builder.py", line 543, in get_all_exported_dataset_infos
    return DatasetInfosDict.from_directory(cls.get_imported_module_dir())
  File "/usr/local/lib/python3.8/dist-packages/datasets/info.py", line 437, in from_directory
    return cls.from_dataset_card_data(dataset_card_data)
  File "/usr/local/lib/python3.8/dist-packages/datasets/info.py", line 463, in from_dataset_card_data
    dataset_info = DatasetInfo._from_yaml_dict(dataset_card_data["dataset_info"])
  File "/usr/local/lib/python3.8/dist-packages/datasets/info.py", line 394, in _from_yaml_dict
    yaml_data["features"] = Features._from_yaml_list(yaml_data["features"])
  File "/usr/local/lib/python3.8/dist-packages/datasets/features/features.py", line 1875, in _from_yaml_list
    return cls.from_dict(from_yaml_inner(yaml_data))
  File "/usr/local/lib/python3.8/dist-packages/datasets/features/features.py", line 1715, in from_dict
    obj = generate_from_dict(dic)
  File "/usr/local/lib/python3.8/dist-packages/datasets/features/features.py", line 1361, in generate_from_dict
    return {key: generate_from_dict(value) for key, value in obj.items()}
  File "/usr/local/lib/python3.8/dist-packages/datasets/features/features.py", line 1361, in <dictcomp>
    return {key: generate_from_dict(value) for key, value in obj.items()}
  File "/usr/local/lib/python3.8/dist-packages/datasets/features/features.py", line 1366, in generate_from_dict
    return Sequence(feature=generate_from_dict(obj["feature"]), length=obj.get("length", -1))
  File "/usr/local/lib/python3.8/dist-packages/datasets/features/features.py", line 1361, in generate_from_dict
    return {key: generate_from_dict(value) for key, value in obj.items()}
  File "/usr/local/lib/python3.8/dist-packages/datasets/features/features.py", line 1361, in <dictcomp>
    return {key: generate_from_dict(value) for key, value in obj.items()}
  File "/usr/local/lib/python3.8/dist-packages/datasets/features/features.py", line 1369, in generate_from_dict
    return class_type(**{k: v for k, v in obj.items() if k in field_names})
  File "<string>", line 5, in __init__
  File "/usr/local/lib/python3.8/dist-packages/datasets/features/features.py", line 1001, in __post_init__
    raise ValueError("Some label names are duplicated. Each label name should be unique.")
ValueError: Some label names are duplicated. Each label name should be unique.
I have checked the annotation files: instances_train2023.jsonl and instances_val2023.jsonl. There are no duplicate labels in these files.
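Since the traceback passes through from_dataset_card_data and dataset_card_data["dataset_info"], the label names being validated seem to come from the dataset_info section of the dataset card rather than from the annotation files. Below is a minimal sketch for checking a list of label names for duplicates. The helper find_duplicate_labels and the example names are hypothetical; the real names would have to be copied from wherever the class labels are declared.

```python
from collections import Counter


def find_duplicate_labels(names):
    """Return the label names that occur more than once, in first-seen order."""
    counts = Counter(names)
    return [name for name in counts if counts[name] > 1]


# Hypothetical label names, standing in for the names declared for the
# class-label feature (these are NOT the real names from my dataset).
example_names = ["boat", "buoy", "boat", "ship"]
print(find_duplicate_labels(example_names))
```

An empty list would mean the names passed in are unique, so the duplicate would have to be somewhere else.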
How can I fix this error?