This line of code has been working for me for months… However, today I ran the code and it's generating the train split. The ‘split’ argument is being ignored.
I guess that until now you were using the copy of the dataset in your local cache. Now that the dataset has been updated on the Hub, the library tries to regenerate it and cache it again, without success. I think this could happen if you have a very old version of datasets. Could you please check which version you have installed?
import datasets
print(datasets.__version__)
If this is the case, I recommend updating it:
pip install -U datasets
If the problem still persists on your side after updating, please post the complete error stack trace and information about your environment (the output of the shell command datasets-cli env).
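In case the datasets-cli command is not available in your shell, here is a rough sketch of how to gather similar information from Python. It is not what datasets-cli env runs internally. The helper name collect_env_info and the list of packages it checks are my own assumptions, so adjust them as needed:

```python
import platform
import sys
from importlib import metadata


def collect_env_info(packages=("datasets", "huggingface_hub", "pyarrow", "pandas")):
    """Return a dict with the platform, Python version, and package versions.

    Hypothetical helper: the package list is an assumption about what is
    useful to report, not the official output of `datasets-cli env`.
    """
    info = {
        "platform": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            # Look up the installed version without importing the package.
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(f"- {key}: {value}")
```

You can paste the printed lines into your reply together with the stack trace.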