Hi, I'm trying to fine-tune the Mistral 7B model on a custom dataset. I was wondering whether setting max_seq_length too large will affect training and lead to worse fine-tuned model performance.
To my understanding, max_seq_length should correspond to the tokenized text length, right? So too large a sequence length might lead to a lot of padding, but I wonder whether that affects fine-tuning in any way other than higher memory usage.
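For reference, this is roughly how I'm checking the tokenized lengths of my dataset to pick a value (a minimal sketch; the data file and the "text" column are placeholders, my actual setup differs):

```python
from transformers import AutoTokenizer
from datasets import load_dataset

# Placeholder dataset path and column name for illustration
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
dataset = load_dataset("json", data_files="my_data.json", split="train")

# Tokenize without truncation to see the true length distribution
lengths = [
    len(tokenizer(example["text"], truncation=False)["input_ids"])
    for example in dataset
]

print(f"max length: {max(lengths)}")
print(f"95th percentile: {sorted(lengths)[int(0.95 * len(lengths))]}")
```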
Many thanks!