Logging & Experiment tracking with W&B

If you're interested in tools for logging and comparing different models and training runs, Weights & Biases is directly integrated with :hugs: Transformers.

You just need to have wandb installed and be logged in.
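For reference, a minimal setup might look like this (the project name below is just a placeholder, not something the integration requires):

```shell
# Install the client and authenticate (paste your API key when prompted)
pip install wandb
wandb login

# Optional: choose the W&B project that runs are logged to
export WANDB_PROJECT=my-finetuning-project
```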

It automatically logs losses, metrics, learning rate, compute resources, etc.

Here is a cool example where I ran a sweep to fine-tune GPT-2 on tweets.

Finally you can use your runs to create cool reports. See for example my huggingtweets report.

See the documentation for more details.

At the moment it is integrated with Trainer and TFTrainer.

If you use PyTorch Lightning, you can use WandbLogger. See the PyTorch Lightning documentation.

Let me know if you have any questions or ideas to make it better!


@boris I have a few questions for the HF Transformers integration:

  1. It looks like wandb is charting the loss, learning rate, and epoch for a given run of Trainer.train(). Are there other things that would be useful to have charted for a finetuning run?

  2. It also looks like wandb is using the logging_steps value in TrainingArguments. Is this right?

  3. Is it preferred to set wandb behavior through the environment variables or in the finetuning script directly?


Hi @aclifton314,

It also logs the validation loss and all the task-dependent metrics defined against your validation dataset.
You can then choose to plot losses/metrics against the epoch instead of the step, by using it as the x-axis in the W&B interface, if that makes more sense for your use case.

That is correct. It can also log evaluation loss/metrics at the end of training if the evaluation loop is called at the end of your script (usually the case when you use one of the “examples” scripts from the library).

There is no preferred way. When I write my own fine-tuning script, I like to pass my variables explicitly, as the code looks clearer to me, but either way should work perfectly fine.
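For instance, the two styles could be sketched like this (the project name is a placeholder; WANDB_PROJECT and WANDB_WATCH are the environment variables W&B documents):

```python
import os

# Style 1: configure W&B through environment variables, set before the
# Trainer is created (e.g. in your shell or at the top of the script).
os.environ["WANDB_PROJECT"] = "my-finetune"  # placeholder project name
os.environ["WANDB_WATCH"] = "all"            # log gradients and parameters

# Style 2: pass the settings explicitly in the script, e.g. by calling
# wandb.init(project="my-finetune", name="run-1") yourself before training.
```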

@boris Thanks for your reply! One more quick question. I see where wandb gets initialized in the HF integration. Suppose I want to also log some other metric or value that isn’t automatically logged by the integration. Is there a way to call the same wandb object from my script that the HF implementation is using, so that this value gets logged too?

Yes, you can just call wandb.log manually at any time:

```python
wandb.log({"my_metric": 1.2}, step=trainer.global_step)
```

Passing the step is not mandatory but is preferred when logging from different places.


@sgugger Can you make the top post a wiki, so the docs link can be updated and relevant updates added when needed?

It’s done!


I’m thinking that we could add a way to log & track datasets and trained models.
We could either do it with an environment variable or another parameter in TrainingArguments.

@boris How did you manage to create such a sweep? Have you used Google Colab or any local machine? (I’m currently struggling with setting up one in Colab)

In the docs it says that there can be issues when using wandb.agent with GPU support.

@katharinafluch There are many ways to run sweeps but I actually ran mine in the console.
There is a new version of wandb coming up soon that will better support sweeps in Colab so I can make a demo for it then!

@boris I have export WANDB_MODE='dry_run' and WANDB_WATCH='all' set up in my environment. I ran a training session and synced the dry run using wandb local. When I view the results on localhost, I don’t see any information about the parameters, just plots of the learning rate, epoch, and loss.

Do you know where I can view this information about the parameters, and/or whether I have done something wrong that prevented them from being recorded?

Is it a TensorFlow model? I believe watch works mainly with PyTorch models.
Also, watch does not work on TPU.

It is a PyTorch model and I am running on a single GPU.

OK, then maybe it’s because you have fewer than 100 training steps. watch logs every 100 steps by default. Try training for more epochs.

I checked the TrainingArguments object and it was set to 12,000 steps.


@boris, should the information about the parameters appear alongside the learning rate, epoch, and loss, or is it somewhere else in the wandb dashboard?

@aclifton314 It appears in separate sections.

By default, you get gradients logged under “gradients” (as long as you have more than 100 training steps).
You can also log parameters by setting WANDB_WATCH to 'all', in which case you get both parameters and gradients (see the documentation).

I made a demo Colab which also logs both gradients and parameters.

Feel free to reach out if you have any other questions!


@boris oh awesome! Thank you for pointing that out. I had totally missed it.

I am not seeing the gradients section, though. Is there something I need to modify in my setup?


Do they appear when you use my Colab?
It worked on my run.

You just need to have at least 100 training steps.

I have export WANDB_WATCH='all' in my .bashrc. Do you think the formatting of that needs to be changed?