I’m not able to load the Pile dataset. Here is the code I’m using:
from datasets import load_dataset

# Load only the Hacker News and Enron Emails subsets of the Pile
dataset_name = "EleutherAI/the_pile"
dataset = load_dataset(dataset_name, subsets=['hacker_news', 'enron_emails'])
Here is the error I’m getting:
TypeError: Couldn't cast array of type
struct<file: string, id: string>
to
{'id': Value(dtype='string', id=None)}
Any ideas why?