One suggestion would be to use the split functionality of datasets to create your folds, as described here: Splits and slicing — datasets 1.6.0 documentation.
Then you could use a loop to fine-tune on each fold with the Trainer and aggregate the predictions per fold.
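
Something like the following could work as a rough sketch of that loop. It assumes the percent-slicing split strings from the linked docs page and a split named "train". `kfold_split_specs`, `cross_validate`, and `make_trainer` are hypothetical names I made up for illustration, and `make_trainer` is something you would write yourself to return a configured `Trainer` for a given train/validation pair.

```python
def kfold_split_specs(n_folds=5, base="train"):
    """Build (train_spec, val_spec) split strings, one pair per fold."""
    if n_folds < 2 or 100 % n_folds != 0:
        raise ValueError("n_folds must be >= 2 and divide 100 evenly")
    step = 100 // n_folds
    specs = []
    for start in range(0, 100, step):
        end = start + step
        # Held-out slice for this fold, e.g. "train[20%:40%]".
        val_spec = f"{base}[{start}%:{end}%]"
        # Everything outside the held-out slice, e.g. "train[:20%]+train[40%:]".
        train_spec = f"{base}[:{start}%]+{base}[{end}%:]"
        specs.append((train_spec, val_spec))
    return specs


def cross_validate(dataset_name, make_trainer, n_folds=5):
    """Fine-tune once per fold and collect the held-out predictions.

    make_trainer is a hypothetical user-supplied callable taking
    (train_dataset, eval_dataset) and returning a transformers Trainer.
    """
    # Imported here so kfold_split_specs can be used without datasets installed.
    from datasets import load_dataset

    fold_outputs = []
    for train_spec, val_spec in kfold_split_specs(n_folds):
        train_ds = load_dataset(dataset_name, split=train_spec)
        val_ds = load_dataset(dataset_name, split=val_spec)
        # make_trainer should build a fresh model for each fold, otherwise
        # later folds start from weights that have already seen their data.
        trainer = make_trainer(train_ds, val_ds)
        trainer.train()
        fold_outputs.append(trainer.predict(val_ds))
    return fold_outputs
```

Each entry in the returned list is what `trainer.predict` gives back for that fold (predictions, label ids and metrics), so you can average the metrics or concatenate the predictions afterwards. Tokenization is left out here and would have to happen inside `make_trainer` or before it.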