Chapter 3 questions

QuantiPhy · August 27, 2023, 2:54pm

In the fine-tuning chapter, the article on full training isn’t available for TensorFlow.

zongxiao · September 1, 2023, 12:59am

Try it out! Replicate the preprocessing on the GLUE SST-2 dataset. It’s a little bit different since it’s composed of single sentences instead of pairs, but the rest of what we did should look the same. For a harder challenge, try to write a preprocessing function that works on any of the GLUE tasks.
How many input types do the GLUE tasks have? Single sentences, pairs, or three sentences?
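
A possible starting point for the harder challenge, as a rough sketch rather than the course's official solution: as far as I know, every GLUE task takes either a single sentence or a pair of sentences, so a lookup table of column names is enough. The GLUE_TASK_TO_KEYS table and the make_tokenize_function helper are names made up for this example, and the column names are listed from memory, so check them against raw_datasets["train"].features before relying on them.

```python
# Column names per GLUE task (assumed; verify with raw_datasets["train"].features).
GLUE_TASK_TO_KEYS = {
    "cola": ("sentence", None),
    "sst2": ("sentence", None),
    "mrpc": ("sentence1", "sentence2"),
    "qqp": ("question1", "question2"),
    "stsb": ("sentence1", "sentence2"),
    "mnli": ("premise", "hypothesis"),
    "qnli": ("question", "sentence"),
    "rte": ("sentence1", "sentence2"),
    "wnli": ("sentence1", "sentence2"),
}


def make_tokenize_function(task, tokenizer):
    """Return a function suitable for Dataset.map(..., batched=True)."""
    first_key, second_key = GLUE_TASK_TO_KEYS[task]

    def tokenize_function(examples):
        if second_key is None:
            # Single-sentence task: pass one text column.
            return tokenizer(examples[first_key], truncation=True)
        # Sentence-pair task: pass both columns so they are encoded together.
        return tokenizer(examples[first_key], examples[second_key], truncation=True)

    return tokenize_function
```

With a tokenizer loaded as in the chapter, this would be used as raw_datasets.map(make_tokenize_function("sst2", tokenizer), batched=True).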

zongxiao · September 1, 2023, 1:48am

from transformers import TrainingArguments

training_args = TrainingArguments("test-trainer")

ImportError Traceback (most recent call last)
in <cell line: 3>()
1 from transformers import TrainingArguments
2
----> 3 training_args = TrainingArguments("test-trainer")

4 frames
/usr/local/lib/python3.10/dist-packages/transformers/training_args.py in _setup_devices(self)
1770 if not is_sagemaker_mp_enabled():
1771 if not is_accelerate_available(min_version="0.20.1"):
-> 1772 raise ImportError(
1773 "Using the Trainer with PyTorch requires accelerate>=0.20.1: Please run pip install transformers[torch] or pip install accelerate -U"
1774 )

ImportError: Using the Trainer with PyTorch requires accelerate>=0.20.1: Please run pip install transformers[torch] or pip install accelerate -U

NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
“Open Examples” button below.

anuragrawal · October 6, 2023, 2:47pm

Hi All,

I want to fine tune a summarization model on a custom dataset. Are there any guidelines around how much data I would need, will data from a different domain help, etc.?

I am trying to summarize conversations. In most cases, these conversations will involve just two people. I finetuned google/flan-t5-base and facebook/bart-large-cnn on about 1000 examples, results are good but not as good as GPT-3.5.

Do I need to gather and train on more data? If I don’t have access to data for my use case, can I use data from any other domain as long as they are conversations? Say, from podcasts?

How long should I train the model? Are there any best practices for choosing the number of epochs, etc.?

I am looking to improve the performance of my model and could really use some help! I have looked online but can’t find a clear answer. I understand that in a lot of cases you need to experiment to find what works for you, but there are so many possibilities, and as a beginner in this field I am looking for a starting point.

Thank you for your help!

Davin23 · October 10, 2023, 3:39pm

When I run trainer.train(), I get the following error:
TypeError: 'NoneType' object is not callable.

lokeshm · October 12, 2023, 1:37am

Based on the example in the chapter: the attention_mask returned by tokenize_function has the same length as the tokenized sequence and consists entirely of 1s. DataCollatorWithPadding then extends each attention_mask with 0s so that every example matches the longest sequence in the batch.
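
To make the padding behaviour concrete, here is a small pure-Python sketch. It is not the DataCollatorWithPadding implementation, only an illustration of the idea: pad_batch is a hypothetical helper, the token ids are made up, padding is assumed to go on the right, and the real collator returns tensors rather than lists.

```python
def pad_batch(features, pad_token_id=0):
    """Pad every example in a batch to the longest input_ids in that batch."""
    max_len = max(len(f["input_ids"]) for f in features)
    padded = []
    for f in features:
        n_pad = max_len - len(f["input_ids"])
        padded.append({
            "input_ids": f["input_ids"] + [pad_token_id] * n_pad,
            # Real tokens keep mask 1; padding positions get mask 0.
            "attention_mask": f["attention_mask"] + [0] * n_pad,
        })
    return padded


# Two already-tokenized examples of different lengths (token ids are made up).
batch = [
    {"input_ids": [101, 7592, 102], "attention_mask": [1, 1, 1]},
    {"input_ids": [101, 7592, 2088, 999, 102], "attention_mask": [1, 1, 1, 1, 1]},
]
padded = pad_batch(batch)
print(padded[0]["attention_mask"])  # [1, 1, 1, 0, 0]
```

Because the padding length is decided per batch, a batch of short sentences is padded less than a batch that contains one long sentence.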

FelipeC · October 24, 2023, 3:05pm

I am working on the last ‘Try it out!’ in the Chapter 3 section ‘Fine-tuning a model with the Trainer API’. Everything goes fine until the line ‘trainer.train()’, as shown in the image.

Please help me to solve this error.

buscon · November 2, 2023, 2:06pm

In Chapter 3, section ‘Fine-tuning a model with the Trainer API’, I run into the following error when I instantiate the TrainingArguments class.
The accelerate module is already installed in the requested version.

The issue happens in the linked colab exercise notebook.

Any idea how to fix it?


---------------------------------------------------------------------------

ImportError                               Traceback (most recent call last)

<ipython-input-3-11170ce17e38> in <cell line: 3>()
      1 from transformers import TrainingArguments
      2 
----> 3 training_args = TrainingArguments("test-trainer")

4 frames

/usr/local/lib/python3.10/dist-packages/transformers/training_args.py in _setup_devices(self)
   1799         if not is_sagemaker_mp_enabled():
   1800             if not is_accelerate_available(min_version="0.20.1"):
-> 1801                 raise ImportError(
   1802                     "Using the `Trainer` with `PyTorch` requires `accelerate>=0.20.1`: Please run `pip install transformers[torch]` or `pip install accelerate -U`"
   1803                 )

ImportError: Using the `Trainer` with `PyTorch` requires `accelerate>=0.20.1`: Please run `pip install transformers[torch]` or `pip install accelerate -U`


---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.

Dylanclli · November 7, 2023, 7:37am

Have you finished this?

Diegulio · November 14, 2023, 11:48pm

Hi! I don’t understand why we tokenize the dataset using map and also pass the tokenizer as the tokenizer argument of the Trainer. What does the tokenizer argument do in the Trainer?

jbao8899 · November 20, 2023, 6:47pm

I am working through the Transformers course and am running into trouble in the lesson “Fine-tuning a model with the Trainer API.” I am running it on a free Google Colab instance with a T4 GPU. All of the provided code works (I had to add a !pip install transformers torch after the !pip install datasets evaluate transformers[sentencepiece] to make it work), but at the bottom, we are told to “Fine-tune a model on the GLUE SST-2 dataset, using the data processing you did in section 2.” Here, I ran into a strange error.

Below is the code I was trying to use for this exercise. Each code block is put into its own block here.

single_sentence_dataset = load_dataset("glue", "sst2")
single_sentence_dataset

single_sentence_dataset['train'].features

single_sentence_dataset['train'][0]

checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

def tokenize_single_sentence_function(example):
    return tokenizer(example["sentence"], truncation=True)


tokenized_single_sentence_datasets = single_sentence_dataset.map(tokenize_single_sentence_function, batched=True)
single_sentence_data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

tokenized_single_sentence_datasets['train'][0]

single_sentence_training_args = TrainingArguments("sst2-trainer", evaluation_strategy="epoch")

from transformers import AutoModelForSequenceClassification

single_sentence_model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

def compute_metrics_single_sentence(eval_preds):
    metric = evaluate.load("glue", "sst2")
    logits, labels = eval_preds
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

trainer = Trainer(
    single_sentence_model,
    single_sentence_training_args,
    train_dataset=tokenized_single_sentence_datasets["train"],
    eval_dataset=tokenized_single_sentence_datasets["validation"],
    data_collator=single_sentence_data_collator,
    tokenizer=tokenize_single_sentence_function,
    compute_metrics=compute_metrics_single_sentence
)

trainer.train()

Everything runs, except for trainer.train(). When I call that, it gets to [ 501/25257 00:46 < 38:17, 10.77 it/s, Epoch 0.06/3] (what is the 25257 from, anyways? There are 67349 training examples here) before crashing with the following error:

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in _maybe_log_save_evaluate(self, tr_loss, model, trial, epoch, ignore_keys_for_eval)
   2280 
   2281         if self.control.should_save:
-> 2282             self._save_checkpoint(model, trial, metrics=metrics)
   2283             self.control = self.callback_handler.on_save(self.args, self.state, self.control)
   2284 

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in _save_checkpoint(self, model, trial, metrics)
   2348         run_dir = self._get_output_dir(trial=trial)
   2349         output_dir = os.path.join(run_dir, checkpoint_folder)
-> 2350         self.save_model(output_dir, _internal_call=True)
   2351         if self.is_deepspeed_enabled:
   2352             # under zero3 model file itself doesn't get saved since it's bogus! Unless deepspeed

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in save_model(self, output_dir, _internal_call)
   2841 
   2842         elif self.args.should_save:
-> 2843             self._save(output_dir)
   2844 
   2845         # Push to the Hub when `save_model` is called by the user.

/usr/local/lib/python3.10/dist-packages/transformers/trainer.py in _save(self, output_dir, state_dict)
   2904 
   2905         if self.tokenizer is not None:
-> 2906             self.tokenizer.save_pretrained(output_dir)
   2907 
   2908         # Good practice: save your training arguments together with the trained model

AttributeError: 'function' object has no attribute 'save_pretrained'

Removing the compute_metrics argument does not change anything.

Does anyone know what is going on? I am not explicitly telling it to save anything. Why is this failing when the provided code works?

Thank you!

Edit:

I was able to train a model for SST2 successfully without using the trainer API in the next lesson. Why does it not work here?

jbao8899 · November 22, 2023, 4:11pm

I fixed the issue. You need to pass the tokenizer, and not the tokenizer function, into Trainer. Also, I had to set save_strategy="steps" and save_steps=0.25 in the TrainingArguments to prevent the training process from filling up my Google Colab’s disk by storing weights every 500 steps.
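
On the 25257 asked about earlier: it appears to be the total number of optimizer steps, not the number of examples. The arithmetic below is a sketch under two assumptions that are not stated in the thread: the default per-device batch size of 8 on a single device, and that a save_steps value below 1 is interpreted as a fraction of the total training steps.

```python
import math

num_examples = 67349   # SST-2 training examples, from the post above
batch_size = 8         # assumed: default per-device batch size, one device
num_epochs = 3         # from the "Epoch 0.06/3" progress bar

steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
print(total_steps)  # 25257

# Checkpoints written when saving every 500 steps.
checkpoints_every_500 = total_steps // 500
print(checkpoints_every_500)  # 50

# Assumed: a save_steps value below 1 is treated as a fraction of total steps.
save_interval = math.ceil(total_steps * 0.25)
checkpoints_with_ratio = total_steps // save_interval
print(save_interval, checkpoints_with_ratio)  # 6315 3
```

Under these assumptions, the crash at step 501 fits the first checkpoint being written around step 500, and around 50 full checkpoints would explain the disk filling up.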

jwschrader · November 30, 2023, 11:27pm

Hi folks,
When running the PyTorch version of the “Fine-tuning a model” notebook (Google Colab), I’m getting this error:

"Using the `Trainer` with `PyTorch` requires `accelerate>=0.20.1`: Please run `pip install transformers[torch]` or `pip install accelerate -U`"

I tried pip installing as the error suggests but no luck. Any ideas would be appreciated!

Awesome courses, btw! Thanks!

Garydeu · December 11, 2023, 1:30pm

Hey, sorry, I have one question related to this code:
training_args = TrainingArguments("test-trainer")
When I copy this code and run it in my Jupyter notebook, it gives me an error saying that I need to install either transformers or accelerate.

I installed both successfully, but the import error still occurs.
Could someone explain how I can solve this problem?
Thank you

TwentyNine · December 13, 2023, 3:06pm

I am also getting the ImportError that other users are seeing in the PyTorch version of the “Fine-tuning” Google Colab notebook.

MatthiasAppelGip · January 3, 2024, 4:28pm

I had the same problem in the Colab notebook and locally in PyCharm.
It worked when I executed the following instruction in the Colab notebook:
!pip install accelerate
Update: After changing the runtime to T4 in the Colab notebook, the error appeared again. I changed to T4 because training was very slow.

suhaaspk · February 4, 2024, 7:25pm

Everyone who had the following error:

"Using the `Trainer` with `PyTorch` requires `accelerate>=0.20.1`: Please run `pip install transformers[torch]` or `pip install accelerate -U`"

I had the same error and I solved it by restarting the runtime in colab and then running

! pip install -U accelerate
! pip install -U transformers

which upgrades the accelerate and transformers libraries to their latest versions. Hopefully this works for others.
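
If the error persists after installing, it may help to check which accelerate version the running kernel actually sees. One likely reason a restart matters is that a notebook keeps already-imported packages in memory until the runtime is restarted. The snippet below is a minimal sketch using only the standard library; installed_version, release_tuple, and meets_minimum are helper names invented here, and the comparison is deliberately simplified (it ignores pre-release and local version tags).

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def release_tuple(version):
    """Turn '0.20.1' (or '0.21.0.dev0') into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum):
    """Simplified comparison; ignores pre-release and local version tags."""
    return release_tuple(version) >= release_tuple(minimum)


found = installed_version("accelerate")
if found is None:
    print("accelerate is not installed in this environment")
else:
    print("accelerate", found, "ok" if meets_minimum(found, "0.20.1") else "too old")
```

Running this in a fresh cell right after restarting the runtime shows whether the upgrade landed in the environment the notebook is actually using.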

AltShift · February 7, 2024, 10:13am

I had to uninstall them first, then reinstall, before it worked.

sztal · February 12, 2024, 4:11pm

Hi,

I am trying to run the code from the first chunk in the chapter, but I keep getting an error from load_dataset(). Namely, it happens when I do:

from datasets import load_dataset
from transformers import AutoTokenizer, DataCollatorWithPadding

raw_datasets = load_dataset("glue", "mrpc")

And the error is:
TypeError: expected str, bytes or os.PathLike object, not NoneType

I use datasets==2.12.0 and transformers==4.37.2.

Any idea what may be causing this?

sztal · February 12, 2024, 4:35pm

Ok, I set up a Conda environment from scratch enforcing datasets>=2.17.0 and now it works. So, I guess, the problem was due to an old version of datasets?
