Fine tune with different max_length

drewaight · June 15, 2022, 7:33pm

This is probably a stupid question, but I can't find the answer anywhere.

Can I fine-tune with a longer max_length than the initial model was trained with? My initial LM was trained with max_length=150, but I want to fine-tune for sequence classification with max_length=300. Is that possible, or should I retrain with the longer length?

I will also ask (I am training on protein sequences): my model was trained on a mixture of individual sequences of A and B, but I want to fine-tune on concatenated sequences of AB (hence the longer length). Is this folly or acceptable? Thanks for any insight!

Drew

Slinae · June 16, 2022, 1:40am

Hi, drewaight,

Could you please give more details? I am not sure what you mean by “trained on a mixture of individual sequences of A and B”. What is the difference between A and B?

I think it would be better to retrain the model if you need to change the max length.
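Whether retraining is needed depends largely on how the pretrained model encodes positions. The following is a minimal sketch, assuming a BERT-style model with a learned absolute position table. The function name and the value 512 are illustrative assumptions, not details from this thread; `table_size` stands in for whatever limit the model config reports (often called `max_position_embeddings` in Hugging Face configs).

```python
def classify_positions(finetune_max_length, pretrain_max_length, table_size):
    """Count how many fine-tuning positions were seen during pretraining.

    Assumes learned absolute position embeddings: positions beyond the
    table cannot be indexed at all, and positions inside the table but
    beyond the pretraining max_length were never updated.
    """
    if finetune_max_length > table_size:
        raise ValueError(
            f"max_length={finetune_max_length} exceeds the position table "
            f"({table_size}); the table would need resizing or retraining"
        )
    trained = min(finetune_max_length, pretrain_max_length)
    untrained = finetune_max_length - trained
    return {"trained": trained, "untrained": untrained}


# Pretrained at 150, fine-tuning at 300, hypothetical table of 512 positions.
report = classify_positions(300, 150, 512)
print(report)  # {'trained': 150, 'untrained': 150}
```

Under these assumptions, fine-tuning at 300 would run, but positions 150 to 299 would start from their untrained initial values, which is one reason retraining at the longer length may be the safer choice.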

drewaight · June 16, 2022, 3:56am

Absolutely. The model is trained on ~60 million unpaired heavy and light chain sequences (~150 characters each) from the OAS. The model learns representations of these protein sequences (heavy and light chains).

…only the functional unit (antibody… that we have data on) is a heterodimer of one light chain sequence and one heavy chain sequence. Essentially any heavy chain and any light chain can pair.

I was wondering: if I retrain the model with max_length=300, would it make any sense to fine-tune on paired data with light-heavy concatenated sequences? Would the LM even recognize that the input is just two concatenated shorter “sentences” of the kind it has been trained on, or would it be nonsense because it always expects the padded region 150-300 to be empty?
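To make the concatenated layout concrete, here is a minimal sketch of how a light-heavy pair might be packed into one input. It assumes one token per residue and BERT-style special tokens (`[CLS]`, `[SEP]`, `[PAD]`); the actual tokenizer for this model may differ, `encode_pair` is a hypothetical helper, and the two sequences are made-up toy strings.

```python
CLS, SEP, PAD = "[CLS]", "[SEP]", "[PAD]"


def encode_pair(light, heavy, max_length=300):
    """Pack two chains as [CLS] light [SEP] heavy [SEP], then pad."""
    tokens = [CLS] + list(light) + [SEP] + list(heavy) + [SEP]
    if len(tokens) > max_length:
        raise ValueError(
            f"pair needs {len(tokens)} tokens but max_length is {max_length}"
        )
    n_pad = max_length - len(tokens)
    # 1 marks real tokens, 0 marks padding the model should ignore.
    attention_mask = [1] * len(tokens) + [0] * n_pad
    return tokens + [PAD] * n_pad, attention_mask


# Toy example with made-up 6-residue fragments.
tokens, mask = encode_pair("DIQMTQ", "EVQLVE", max_length=20)
print(tokens[:15])
print(sum(mask))  # 15 real tokens: 6 + 6 residues plus 3 special tokens
```

One consequence of this layout: two full 150-residue chains need 150 + 150 + 3 = 303 tokens, so under these assumptions max_length=300 would be slightly too short for the longest pairs.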

Good thing this is the beginners forum. Thanks for your help!

Drew

paper for inspiration
