wandb.watch in the Accelerate library

When I’m using wandb, their documentation for the PyTorch integration suggests a call to the watch method:

wandb.watch(my_model, log='all', log_freq=8)

With the following environment variable set:

export WANDB_WATCH="all"

I’m able to record the gradients and the parameter values throughout training. Is there a way to record the gradients and parameter values throughout training when using the Accelerate library?

Did you try calling wandb.watch() just after initialising the wandb run in Accelerate? It might still work as long as it’s called before you start logging in your script.


In code, what @morgan means would look something like this:

import wandb
from accelerate import Accelerator

accelerator = Accelerator(log_with="wandb")
accelerator.init_trackers("my_projectname")
wandb.watch(my_model, log="all", log_freq=8)

@morgan @muellerzr That works perfectly! I tested it out and you can call wandb.watch() even right before the training loop starts (i.e. before any metrics are logged to wandb).
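
For anyone who finds this later, here’s a minimal end-to-end sketch of that placement. The toy model and data are just placeholders to show where everything goes, and the is_main_process guard is my own addition, on the assumption that Accelerate only initialises the wandb run on the main process:

import torch
import torch.nn.functional as F
import wandb
from accelerate import Accelerator

accelerator = Accelerator(log_with="wandb")
accelerator.init_trackers("my_projectname")

# toy model and data, just to show where everything goes
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
dataset = torch.utils.data.TensorDataset(torch.randn(64, 10), torch.randn(64, 1))
dataloader = torch.utils.data.DataLoader(dataset, batch_size=8)

model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

# watch gradients and parameters; any point before the first log call works,
# but only the main process has an active wandb run
if accelerator.is_main_process:
    wandb.watch(model, log="all", log_freq=8)

for step, (x, y) in enumerate(dataloader):
    optimizer.zero_grad()
    loss = F.mse_loss(model(x), y)
    accelerator.backward(loss)
    optimizer.step()
    accelerator.log({"train_loss": loss.item()}, step=step)

accelerator.end_training()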


One last thing. I noticed that if I run Accelerate on multiple GPUs (say 3) and choose wandb as my tracker, it will output 3 files to sync to wandb. Is there a way to somehow aggregate all those files into one to get a single view of the entire training run?


@aclifton314 we’re aware of this and will be working on it. For now, you could pass a group argument to the wandb init kwargs, then group by this value in the UI: Group Runs - Documentation
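
For the concrete call: assuming your version of Accelerate supports the init_kwargs argument of init_trackers (a per-tracker dict of keyword arguments forwarded to wandb.init), the group value can be passed through like this ("my_group" is a placeholder):

from accelerate import Accelerator

accelerator = Accelerator(log_with="wandb")
accelerator.init_trackers(
    "my_projectname",
    init_kwargs={"wandb": {"group": "my_group"}},
)

All runs started with the same group value can then be grouped together in the wandb UI.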
