I have been looking for certain features in the HuggingFace `Trainer` (in particular `Seq2SeqTrainer`) and would like to know whether they exist and, if so, how to implement them, or whether I would have to write my own training loop to enable them.
I am looking to apply Curriculum Learning to my training strategy, as well as to evaluate the model at regular intervals, and would therefore like to enable the following:
- choose the order in which the model sees training samples at each epoch (it seems that the data passed to the `train_dataset` argument is automatically shuffled by some internal code, and even if I managed to stop that, I would still need to pass differently ordered data at different epochs: I may want to start training the model on easy samples for a few epochs, and then pass a random shuffle of all the data for later epochs)
- run custom evaluation at integer multiples of a fixed number of steps. The standard `compute_metrics` argument of the `Trainer` takes a function to which the predictions and labels are passed*, and the user can decide how to generate the metrics given these. However, I'd like a finer level of control, for example changing the maximum sequence length for the tokenizer when doing evaluation, as opposed to training, which would require including some explicit evaluation code inside `compute_metrics` that needs access to the trained model and to the data on disk.
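To make the first point concrete, here is a plain-Python sketch (no `Trainer` involved) of the per-epoch ordering I would want to impose. The names `difficulty_scores` and `warmup_epochs` are placeholders I made up for illustration:

```python
import random

def epoch_order(difficulty_scores, epoch, warmup_epochs=3, seed=0):
    """Return sample indices in the order the model should see them at
    the given epoch: easiest-first during warm-up, random afterwards."""
    indices = list(range(len(difficulty_scores)))
    if epoch < warmup_epochs:
        # Curriculum phase: sort indices from easiest to hardest sample.
        indices.sort(key=lambda i: difficulty_scores[i])
    else:
        # Later epochs: plain random shuffle, seeded per epoch for reproducibility.
        random.Random(seed + epoch).shuffle(indices)
    return indices
```

The missing piece is getting the `Trainer` to consume the dataset in exactly this order each epoch instead of applying its own internal shuffle.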
Can these two points be achieved by using the `Trainer` on a multi-GPU machine, or would I have to write my own training loop?
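For the second point, this plain-Python sketch shows the kind of control I mean; `eval_every` and the max-length values are placeholders I invented, not anything from the `Trainer` API:

```python
def should_run_eval(global_step, eval_every=500):
    """Trigger custom evaluation at integer multiples of a fixed step count."""
    return global_step > 0 and global_step % eval_every == 0

# Different tokenizer settings per phase -- the part I can't express through
# compute_metrics alone, since it only receives predictions and labels.
TOKENIZER_KWARGS = {
    "train": {"max_length": 256, "truncation": True},
    "eval": {"max_length": 512, "truncation": True},
}

def tokenizer_kwargs(phase):
    """Look up the tokenizer settings for the current phase."""
    return TOKENIZER_KWARGS[phase]
```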
*The function often looks something like this, and I'm not sure it would work with the `Trainer` if it doesn't have this signature:
```python
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    ...
```
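For completeness, a fleshed-out version of the shape I mean: a sketch computing exact-match accuracy, pretending `predictions` and `labels` are already-decoded strings (in reality they would be token IDs needing decoding first):

```python
def compute_metrics(eval_pred):
    """Standard signature: a (predictions, labels) pair in,
    a dict mapping metric names to floats out."""
    predictions, labels = eval_pred
    # Fraction of predictions that exactly match their reference label.
    exact_matches = sum(p == l for p, l in zip(predictions, labels))
    return {"exact_match": exact_matches / len(labels)}
```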