I want to speed up the "Generating train split:" step of loading the cornell_movie_dialog dataset.
I figured that if I could load only a subset of the dataset, the generation would go faster.
Is it possible to do that?
You should be able to index into a dataset, so try cornell_movie_dialog[:1000] and see if that works.
It doesn’t. The "Generating train split:" step would still take all 80-ish thousand entries.
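A likely reason, offered as a sketch rather than a description of the library's internals: indexing only applies once the whole split has been generated, so the slice cannot shorten the generation step. The pure-Python example below illustrates the difference between slicing after a full build and stopping the generator early. The generate_examples function and its counter are hypothetical stand-ins, not part of any real library.

```python
from itertools import islice


def generate_examples(total, counter):
    """Hypothetical stand-in for a dataset builder's example generator."""
    for i in range(total):
        counter["generated"] += 1  # track how many examples were actually produced
        yield {"id": i, "utterance": f"line {i}"}


TOTAL = 80_000  # roughly the number of entries mentioned in the thread

# Eager: build the full split first, then slice it. Every example is generated.
eager_counter = {"generated": 0}
eager_subset = list(generate_examples(TOTAL, eager_counter))[:1000]

# Lazy: stop pulling from the generator once 1000 examples have been read.
lazy_counter = {"generated": 0}
lazy_subset = list(islice(generate_examples(TOTAL, lazy_counter), 1000))

print(eager_counter["generated"], lazy_counter["generated"])  # 80000 1000
```

If this is the Hugging Face datasets library, passing streaming=True to load_dataset returns an iterable dataset whose take(1000) method plays the role of islice here. Whether streaming works for cornell_movie_dialog depends on the installed library version and on how the dataset is hosted, so treat that as something to try rather than a guaranteed fix.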