Hi guys,
I am trying to fine-tune the GPTSAN model for a summarisation task, as explained in a Hugging Face blog post (please see here). The modifications I made are:
- changing the model
- redefining the preprocessing step:
def preprocess_function(examples):
    # tokenize the articles, truncated to 1024 tokens
    model_inputs = tokenizer(examples['text'], truncation=True, max_length=1024)
    # tokenize the target summaries, truncated to 128 tokens
    labels = tokenizer(examples['summary'], truncation=True, max_length=128)
    model_inputs['labels'] = labels['input_ids']
    return model_inputs
- I used a standard training loop instead of the Trainer (the rest of the setup is sketched right after the loop):
for batch in train_dataloader:
    optimizer.zero_grad()
    # move every tensor in the batch to the GPU
    batch = {k: v.to(device) for k, v in batch.items()}
    pred = model(input_ids=batch['input_ids'],
                 attention_mask=batch['attention_mask'],
                 token_type_ids=batch['token_type_ids'],
                 decoder_inputs_embeds=batch['decoder_input_ids'],
                 labels=batch['labels'])
    loss = pred.loss
    loss.backward()
    optimizer.step()
    scheduler.step()
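For completeness, here is roughly how everything referenced in the loop is set up. The checkpoint name is the public GPTSAN one, but the learning rate, batch size, and epoch count are just placeholders, and I have simplified the collation (in my actual run the batches also contain token_type_ids and decoder_input_ids):

import torch
from torch.utils.data import DataLoader
from transformers import (AutoTokenizer, GPTSanJapaneseForConditionalGeneration,
                          DataCollatorForSeq2Seq, get_linear_schedule_with_warmup)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

tokenizer = AutoTokenizer.from_pretrained('Tanrei/GPTSAN-japanese')
model = GPTSanJapaneseForConditionalGeneration.from_pretrained('Tanrei/GPTSAN-japanese').to(device)

# apply the preprocessing above and build a dataloader with a padding collator
tokenized_dataset = raw_dataset.map(preprocess_function, batched=True,
                                    remove_columns=raw_dataset['train'].column_names)
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
train_dataloader = DataLoader(tokenized_dataset['train'], batch_size=4,
                              shuffle=True, collate_fn=data_collator)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
num_epochs = 3
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0,
    num_training_steps=num_epochs * len(train_dataloader))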
Then I got an error message saying that the shape of input_ids is not the same as that of the labels:

ValueError: Expected input batch_size (1024) to match target batch_size (128).
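If I read the traceback correctly, the mismatch happens inside the loss computation. I assume the model flattens the logits and the labels before calling cross-entropy (this is only my guess about the internals; the vocabulary size and batch size below are just for illustration), so a toy example like this raises exactly the same ValueError:

import torch
import torch.nn as nn

# batch size 1: logits cover the 1024 input positions, labels only 128
logits = torch.randn(1, 1024, 32000)          # (batch, input_len, vocab)
labels = torch.randint(0, 32000, (1, 128))    # (batch, label_len)

loss_fct = nn.CrossEntropyLoss()
# raises: Expected input batch_size (1024) to match target batch_size (128)
loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))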
Setting those shapes equal in the preprocess function solves the error, but I don't know whether it is theoretically right or wrong.
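By "setting those shapes equal" I mean something like this, padding and truncating both the inputs and the labels to the same length (1024 here is just an example):

def preprocess_function(examples):
    # pad/truncate articles and summaries to the same fixed length
    model_inputs = tokenizer(examples['text'], truncation=True,
                             padding='max_length', max_length=1024)
    labels = tokenizer(examples['summary'], truncation=True,
                       padding='max_length', max_length=1024)
    model_inputs['labels'] = labels['input_ids']
    return model_inputs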
So, is this the correct way to do it?
Thank you for your kind suggestions.