So I’m about to try to replicate a QLoRA notebook, hopefully fine-tuning a model on a task in 4-bit with PEFT. The idea is to train a very large LM (billions of parameters) on a single GPU. As the explanation wraps up and the narrator walks through the required libraries, he says ‘accelerate’ will be used.
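For context, the model-loading step in these notebooks typically looks something like the sketch below. The model name and LoRA hyperparameters are my own placeholders from the usual QLoRA recipe, not the exact values from the notebook:

```python
# Rough sketch of a typical QLoRA loading step -- model_id and the
# LoRA settings are placeholders, not the notebook's exact values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder; any causal LM works

# 4-bit NF4 quantization via bitsandbytes (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # transformers requires accelerate for device_map
)

model = prepare_model_for_kbit_training(model)

# LoRA adapters (the "LoRA" in QLoRA) -- only these small matrices are trained
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # placeholder module names
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

I notice `device_map="auto"` in there, which I gather is handled by accelerate under the hood, which brings me to my question: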
ACCELERATE - I thought this was for training across multiple GPUs? But if the intention is to train on a single local GPU, why do we need accelerate? Does anyone have a quick 5 minutes to set my mind straight and point me in the right direction?
Thanks! And no problem if not; it would just be nice to hear the answer from someone other than ChatGPT…