Starcoder: CUDA out of memory

Hi!
I searched around, and it seems that accelerator.device always resolves to cuda:0, so the whole model ends up on the first GPU instead of being distributed across all of them. That would explain the out-of-memory error.
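
In case it is useful, here is a rough sketch of one way to spread the model over several GPUs at load time. It assumes you load with Hugging Face transformers and have accelerate installed, so that from_pretrained accepts device_map="auto" and max_memory. The helper names, the GPU count, and the per-GPU memory cap are made up for illustration, and "bigcode/starcoder" is my guess at the model ID you are using.

```python
def build_load_kwargs(num_gpus, per_gpu_gib):
    """Build keyword arguments that ask from_pretrained to shard the model."""
    # max_memory caps what each GPU index may hold; the numbers are assumptions
    # and should be set a bit below the real memory of your cards.
    max_memory = {gpu: f"{per_gpu_gib}GiB" for gpu in range(num_gpus)}
    return {"device_map": "auto", "max_memory": max_memory}


def load_sharded(model_id, num_gpus, per_gpu_gib):
    """Load a causal LM with its layers spread over the visible GPUs."""
    # Imported lazily so build_load_kwargs works without transformers installed.
    from transformers import AutoModelForCausalLM

    kwargs = build_load_kwargs(num_gpus, per_gpu_gib)
    return AutoModelForCausalLM.from_pretrained(model_id, **kwargs)


# Example call (not run here, it downloads the full model):
# model = load_sharded("bigcode/starcoder", num_gpus=2, per_gpu_gib=20)
```

If the load succeeds, model.hf_device_map should show which layers landed on which GPU. With a model loaded this way you would not call model.to(accelerator.device) afterwards, since that would try to move everything back onto one device.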

Maybe this helps you.