I am loading my dataset from a local file and getting the error “TypeError: new(): invalid data type ‘numpy.str_’”, which I believe is caused by the features not being defined.
The issue is mentioned here, and the suggested solution is to pass a features dictionary when loading, but I am having trouble with the format.
I’ve tried things like:
emotions = load_dataset("csv", data_files="train.txt", sep=";",
                        names=["text", "label"],
                        features={'text': datasets.Value(dtype='int32', id=None),
                                  'label': datasets.ClassLabel(num_classes=2, names=['not_equivalent', 'equivalent'], names_file=None, id=None)})
and
load_dataset("csv", data_files="train.txt", sep=";",
             names=["text", "label"],
             features={'text': 'str',
                       'label': ['not_equivalent', 'equivalent']})
Without success.
I’m trying to follow the documentation here but can’t seem to figure it out. Is there an example of how to do this somewhere? Thanks!
https://huggingface.co/docs/datasets/_modules/datasets/features/features.html#Features