Hi there,
I am wondering what is currently the most elegant way to perform a three-way random split (into train, val and test sets). Let’s assume I call load_dataset and get:
Dataset({
    features: ['text'],
    num_rows: 19122
})
Subsequently, I’d like to perform the split. Currently I am calling dataset.train_test_split() twice and then recombining the three resulting datasets into one using DatasetDict. However, I assume that this is not the most elegant approach, right? I also experimented with ReadInstructions, but with those I could only split the data deterministically instead of randomly.
Has anyone got a better solution?