In a distributed setting, should the trackers (for instance accelerate.tracking.TensorBoardTracker) be defined only by the main process (i.e. wrapped inside an accelerator.is_main_process condition) or by all the processes?
For reference, wandb allows both, with slightly different outputs; see the "Distributed Training" page in the wandb documentation.
Accelerate only supports logging on the main process. If there is a need or desire to log from all of the processes, we can support that, but all of the logging functionality is purposefully limited to just the main process.
Also, you don't need to wrap the init call in if accelerator.is_main_process if you are building off the main branch. (This will be propagated to the next release.)
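To illustrate what "limited to just the main process" can look like in practice, here is a minimal sketch of the general pattern: a decorator that turns a tracker method into a no-op on every rank except 0. This is not Accelerate's actual implementation. The names is_main_process, on_main_process, and ListTracker are hypothetical, and the sketch assumes the launcher exports a RANK environment variable (as torchrun does).

```python
import os
from functools import wraps


def is_main_process() -> bool:
    # Assumption: the launcher exports RANK; default to 0 so that a
    # single-process run counts as the main process.
    return int(os.environ.get("RANK", "0")) == 0


def on_main_process(method):
    """Make a method a no-op on every process except the main one."""

    @wraps(method)
    def wrapper(*args, **kwargs):
        if is_main_process():
            return method(*args, **kwargs)
        return None

    return wrapper


class ListTracker:
    """Hypothetical in-memory tracker, used only for illustration."""

    def __init__(self):
        self.records = []

    @on_main_process
    def log(self, values: dict, step: int) -> None:
        # Only reached on the main process, so callers need no explicit guard.
        self.records.append((step, dict(values)))


tracker = ListTracker()
# Safe to call from every process: non-main ranks silently skip it.
tracker.log({"loss": 0.5}, step=0)
```

With a guard like this inside the tracker, the training script can call the logging method unconditionally on all processes, which is why an explicit is_main_process check around it becomes unnecessary.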