This line of code has been working for me for months. However, today I ran it and it's generating the train split. The 'split' argument is being ignored.
dataset_wmt_enfr = load_dataset("wmt14",'fr-en', split='test')
Has anyone else run into this issue? I saw there was a change to the repository a couple of days ago…
I cannot reproduce the issue.
I made the latest changes to the dataset, but these were just:
I guess that until now you were using the copy of the dataset in your local cache. Now that the dataset has been updated on the Hub, the library tries to regenerate and cache it again, without success. I think this could happen if you have a very old version of datasets. Could you please verify?
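One quick way to verify is to print the installed version from Python. This is only a minimal sketch: the helper name installed_version is made up for illustration, and it relies on the standard-library importlib.metadata module (Python 3.8+) rather than anything specific to datasets.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="datasets"):
    """Return the installed version string of `package`, or None if it is not installed.

    `installed_version` is a hypothetical helper name, not part of the datasets library.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None if the package is missing from this environment.
print(installed_version("datasets"))
```

Run this in the same environment (virtualenv, conda env, notebook kernel) that runs the failing code, since a different environment may have a different version installed.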
If this is the case, I recommend updating it:
pip install -U datasets
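After upgrading, it may help to check which split the returned object actually reports, rather than relying on the progress messages printed while the dataset is being prepared. The sketch below assumes the loaded object is a datasets.Dataset, which exposes a split attribute; check_split is a hypothetical helper name used only for illustration.

```python
def check_split(ds, expected="test"):
    """Return True if the loaded object reports the expected split name.

    `check_split` is a hypothetical helper, not part of the datasets library.
    """
    # A datasets.Dataset exposes the split it was loaded from via `.split`.
    # A DatasetDict (all splits at once) has no such attribute, so this returns False.
    actual = getattr(ds, "split", None)
    return actual is not None and str(actual) == expected


# Usage (requires the `datasets` package and a network connection):
# from datasets import load_dataset
# ds = load_dataset("wmt14", "fr-en", split="test")
# print(check_split(ds, "test"), ds.num_rows)
```

If this returns False for the line from the original post, that result together with the printed row count would be useful to include in a follow-up.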
If the problem still persists on your side after that, please include the complete stack trace of the error and information about your environment (by running the shell command