Suppose I have a dataset `d` and I want a copy of the whole dataset, something similar to `torch.clone()` for tensors and `copy.deepcopy` for regular Python objects. Is there a way I could achieve this using `datasets`?
Why not use `copy.deepcopy`?
As answered in this question: What is the diffrence between copy.deepcopy and flatten_indices? - #2 by lhoestq, `copy.deepcopy` will create a copy of the dataset.
I have only been using `transformers` and `datasets` for a few days, but so far everything I have done with a `datasets` object has been without mutation. For example, sorting, shuffling, selecting, filtering, and using `map` do not change your `datasets` object. When there is no mutation, most of the time you don't need to clone anything. So make sure you really need a copy of the `datasets` object.