stderr: WARNING:root:Unsupported nprocs (2), ignoring...
It seems that setting the number of processes to more than 1 during accelerate config causes a cascade of errors. Here’s the config I’m using; the same config with num_processes=1 works without problems:
compute_environment: LOCAL_MACHINE
distributed_type: TPU
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: 'no'
num_machines: 1
num_processes: 2
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
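To make the value in question explicit, here is a minimal sketch that reads num_processes from the config text and flags anything other than 1 or the full core count. The helper name check_num_processes and the rule that only 1 or all 8 cores are accepted are my own assumptions, inferred from the "Unsupported nprocs" warning above; neither comes from the accelerate or torch_xla documentation.

```python
import re


def check_num_processes(config_text: str, total_cores: int = 8) -> str:
    """Hypothetical helper: classify the num_processes value in a config."""
    match = re.search(r"^num_processes:\s*(\d+)\s*$", config_text, flags=re.MULTILINE)
    if match is None:
        return "missing"
    n = int(match.group(1))
    # Assumption: on a v3-8 only a single process or all cores are accepted.
    if n == 1 or n == total_cores:
        return "ok"
    return "unsupported"


# Trimmed copy of the config above, keeping only the relevant keys.
config_text = """\
distributed_type: TPU
num_machines: 1
num_processes: 2
"""

print(check_num_processes(config_text))
```

If that assumption holds, the value 2 is simply not one of the accepted settings, which would match the warning but would not by itself explain the later errors.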
To reproduce, simply install accelerate and run accelerate test with this config file on a Kaggle TPU VM v3-8.
Does that mean XLA won’t be able to use all 8 cores, or is there something I’m missing here?