I’ve found several issues with the “Training a causal language model from scratch” tutorial; they are listed below:
- In the “Training with Accelerate” section, the variable ‘tokenized_dataset’ should be corrected to ‘tokenized_datasets’. This appears to be a typo.
- There is a deprecation issue to note: replace `Accelerator(fp16=True)` with `Accelerator(mixed_precision='fp16')` to use the updated syntax.
- Regarding the evaluation with the accelerator, the correct code should be `losses.append(accelerator.gather(outputs.loss.view(-1)))`, since `outputs.loss` is a scalar (zero-dimensional) tensor and has an empty shape.
- In the accelerator training loop, the variable ‘samples_per_step’ is undefined. I assume it should be the same as ‘batch_size’, which was 32.
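To illustrate the third point, here is a minimal sketch in plain PyTorch, without Accelerate. The loss values are made up and simply stand in for `outputs.loss`, and I am assuming the gathered losses are later concatenated with `torch.cat`, so treat it as an illustration of the shape problem rather than the tutorial's exact code:

```python
import torch

# Hypothetical per-batch losses standing in for `outputs.loss` (zero-dimensional tensors).
scalar_losses = [torch.tensor(2.5), torch.tensor(3.0), torch.tensor(3.5)]

# A scalar tensor has an empty shape.
print(scalar_losses[0].shape)

# Concatenating zero-dimensional tensors raises a RuntimeError.
try:
    torch.cat(scalar_losses)
    cat_failed = False
except RuntimeError:
    cat_failed = True

# view(-1) turns each scalar into a one-element, one-dimensional tensor,
# which can then be concatenated and averaged.
reshaped = [loss.view(-1) for loss in scalar_losses]
mean_loss = torch.cat(reshaped).mean()
print(cat_failed, reshaped[0].shape, mean_loss.item())
```

With `.view(-1)` each loss has shape `(1,)`, so the concatenation and the mean go through.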
I hope this helps. Best.