Hi guys,
I am trying to fine-tune the GPTSAN model for a summarisation task, as explained in a Hugging Face blog post (please see here). The modifications I made are:
- changing the model
- redefining the preprocessing step:
def preprocess_function(examples):
    # tokenize the articles, truncated to 1024 tokens
    model_inputs = tokenizer(examples['text'], truncation=True, max_length=1024)
    # tokenize the target summaries, truncated to 128 tokens
    labels = tokenizer(examples['summary'], truncation=True, max_length=128)
    model_inputs['labels'] = labels['input_ids']
    return model_inputs
- I used a standard training loop instead of the Trainer (the rest of the setup is sketched right after the loop):
for batch in train_dataloader:
    optimizer.zero_grad()
    # move every tensor in the batch to the GPU
    batch = {k: v.to(device) for k, v in batch.items()}
    pred = model(input_ids=batch['input_ids'],
                 attention_mask=batch['attention_mask'],
                 token_type_ids=batch['token_type_ids'],
                 decoder_inputs_embeds=batch['decoder_input_ids'],
                 labels=batch['labels'])
    loss = pred.loss
    loss.backward()
    optimizer.step()
    scheduler.step()
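For completeness, here is roughly how everything referenced in the loop is set up. The checkpoint name is the public GPTSAN one, but the learning rate, batch size, and epoch count are just placeholders, and I have simplified the collation (in my actual run the batches also contain token_type_ids and decoder_input_ids):

import torch
from torch.utils.data import DataLoader
from transformers import (AutoTokenizer, GPTSanJapaneseForConditionalGeneration,
                          DataCollatorForSeq2Seq, get_linear_schedule_with_warmup)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

tokenizer = AutoTokenizer.from_pretrained('Tanrei/GPTSAN-japanese')
model = GPTSanJapaneseForConditionalGeneration.from_pretrained('Tanrei/GPTSAN-japanese').to(device)

# apply the preprocessing above and build a dataloader with a padding collator
tokenized_dataset = raw_dataset.map(preprocess_function, batched=True,
                                    remove_columns=raw_dataset['train'].column_names)
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
train_dataloader = DataLoader(tokenized_dataset['train'], batch_size=4,
                              shuffle=True, collate_fn=data_collator)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
num_epochs = 3
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0,
    num_training_steps=num_epochs * len(train_dataloader))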
Then I got an error message saying that the shape of input_ids is not the same as that of the labels:

ValueError: Expected input batch_size (1024) to match target batch_size (128).
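If I read the traceback correctly, the mismatch happens inside the loss computation. I assume the model flattens the logits and the labels before calling cross-entropy (this is only my guess about the internals; the vocabulary size and batch size below are just for illustration), so a toy example like this raises exactly the same ValueError:

import torch
import torch.nn as nn

# batch size 1: logits cover the 1024 input positions, labels only 128
logits = torch.randn(1, 1024, 32000)          # (batch, input_len, vocab)
labels = torch.randint(0, 32000, (1, 128))    # (batch, label_len)

loss_fct = nn.CrossEntropyLoss()
# raises: Expected input batch_size (1024) to match target batch_size (128)
loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))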
Setting those shapes equal in the preprocess function solves the error, but I don't know whether it is theoretically right or wrong.
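By "setting those shapes equal" I mean something like this, padding and truncating both the inputs and the labels to the same length (1024 here is just an example):

def preprocess_function(examples):
    # pad/truncate articles and summaries to the same fixed length
    model_inputs = tokenizer(examples['text'], truncation=True,
                             padding='max_length', max_length=1024)
    labels = tokenizer(examples['summary'], truncation=True,
                       padding='max_length', max_length=1024)
    model_inputs['labels'] = labels['input_ids']
    return model_inputs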
So, is this the correct way to do it?
Thank you for your kind suggestions.