For a project, I am trying to split a dataset into training, validation, and test sets. In my data, one individual can have multiple entries that are independent of each other. However, I would like to create the splits so that each ID belongs to exactly one of them, i.e. all instances of one individual (one ID) appear only in the training, the validation, or the test data. I cannot find a solution for this in the Datasets library. Does anyone know whether this exists, or what approach would be best in this case?
For context, I know that sklearn supports this through its GroupShuffleSplit, and I am looking for something similar for the Datasets library.
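
To make the desired behavior concrete, here is a minimal sketch of the kind of split I mean, written with the standard library only. The function name `group_split_indices`, the 80/10/10 fractions, and the seed are my own placeholders, not part of any library. It shuffles the unique IDs rather than the rows and returns row indices per split.

```python
import random
from collections import defaultdict


def group_split_indices(group_ids, fractions=(0.8, 0.1, 0.1), seed=0):
    """Return (train, val, test) row-index lists; no group ID spans two splits.

    Hypothetical helper for illustration. `fractions` applies to the number
    of unique IDs, so the row proportions are only approximate.
    """
    # Collect the row indices that belong to each ID.
    rows_by_group = defaultdict(list)
    for idx, gid in enumerate(group_ids):
        rows_by_group[gid].append(idx)

    # Shuffle the unique IDs (not the rows) reproducibly.
    groups = sorted(rows_by_group)  # assumes the IDs are orderable
    random.Random(seed).shuffle(groups)

    # Cut the shuffled ID list into three contiguous parts.
    n_train = int(fractions[0] * len(groups))
    n_val = int(fractions[1] * len(groups))
    parts = (
        groups[:n_train],
        groups[n_train:n_train + n_val],
        groups[n_train + n_val:],  # the remainder goes to the test split
    )

    # Expand each part back into sorted row indices.
    return tuple(
        sorted(i for gid in part for i in rows_by_group[gid]) for part in parts
    )
```

Assuming the Hugging Face `datasets` library and an ID column that I will call `"id"` here, the resulting index lists could then be passed to `Dataset.select`, e.g. `train = ds.select(train_idx)` after `train_idx, val_idx, test_idx = group_split_indices(ds["id"])`. I would still prefer a built-in equivalent if one exists.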