Yes it’s not implemented right now but it should be possible to implement a train_test_split over the dataset shards. Contributions are welcome though if you’re interested in helping on this matter
For now I’d suggest you to define two separate datasets, one with the train data files and one with the test data files