I have assembled this Colab notebook to fine-tune LLaMA 7B adapter weights. Within TrainingArguments, in line 62, I would like to use logging_steps=1 so I can see the logs at every step. However, when I do use logging_steps=1, I get

ValueError: expected sequence of length 24 at dim 1 (got 58)

and as a consequence, the model won't be trained.
Why is that? And how can I avoid that error but still log the loss?
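For reference, the relevant part of the setup looks roughly like this; only logging_steps is the argument in question, and the other values are illustrative placeholders rather than the exact values from the notebook:

from transformers import TrainingArguments

# Sketch of the relevant configuration; only logging_steps is the point
# in question, the other values are illustrative placeholders.
training_args = TrainingArguments(
    output_dir="./llama-adapter-out",   # placeholder path
    per_device_train_batch_size=4,      # placeholder
    num_train_epochs=3,                 # placeholder
    logging_steps=1,                    # log the training loss at every step
)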
Side note:
After trainer.train(), I save the model and the tokenizer. Apparently, this uses the GPU and may cause OutOfMemoryError: CUDA out of memory.
You can just comment out these lines (or pay Colab for more GPU memory).
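The saving step I mean looks roughly like this (the output directory is a placeholder, and model / tokenizer are the objects from the earlier training cells):

# Sketch of the save step after trainer.train(); the directory name is a
# placeholder, and model / tokenizer come from the notebook's training cells.
save_dir = "./llama-adapter-finetuned"
model.save_pretrained(save_dir)       # writes the (adapter) weights
tokenizer.save_pretrained(save_dir)   # writes the tokenizer files alongside them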