`train_test_split` with IterableDataset

Yes, `train_test_split` is not implemented for `IterableDataset` right now, but it should be possible to implement it as a split over the dataset shards. Contributions are welcome if you’re interested in helping with this :slight_smile:

For now, I’d suggest defining two separate datasets: one built from the train data files and one from the test data files.
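
Here is a rough sketch of that workaround, in case it helps. It assumes your data is already stored as several JSON Lines shards. The helper names `split_shards` and `load_streaming_splits`, the `test_fraction` parameter, and the file names in the usage note are made up for illustration and are not part of `datasets`. The split happens at file granularity, so the ratio is only approximate if the shards differ in size.

```python
from __future__ import annotations

from typing import Sequence


def split_shards(
    files: Sequence[str], test_fraction: float = 0.2
) -> tuple[list[str], list[str]]:
    """Partition shard paths into (train_files, test_files).

    Hypothetical helper: it splits whole files, not individual examples.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    ordered = sorted(files)  # sort so the split is deterministic
    n_test = max(1, round(len(ordered) * test_fraction))
    if n_test >= len(ordered):
        raise ValueError("need at least two shards to build both splits")
    return ordered[:-n_test], ordered[-n_test:]


def load_streaming_splits(train_files: list[str], test_files: list[str]):
    """Build one streaming dataset per list of files (assumes JSON Lines data)."""
    # Imported here so split_shards can be used without `datasets` installed.
    from datasets import load_dataset

    # With a plain list of data_files, everything lands in the default "train"
    # split, which is why both calls ask for split="train".
    train_ds = load_dataset(
        "json", data_files=train_files, split="train", streaming=True
    )
    test_ds = load_dataset(
        "json", data_files=test_files, split="train", streaming=True
    )
    return train_ds, test_ds
```

You would call it with something like `train_files, test_files = split_shards(my_shard_paths)` followed by `train_ds, test_ds = load_streaming_splits(train_files, test_files)`, where `my_shard_paths` is your own list of paths. If your files are Parquet or CSV rather than JSON Lines, swap `"json"` for the matching builder name.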