Huggingface on Databricks

slowcheetah · November 12, 2021, 3:31am

I have a multi-GPU cluster set up on Databricks: a driver plus 2 workers, each node with 2 GPUs, for a total of 6 GPUs. I want to run distributed training and inference on this cluster. Using the distributed modules, I am only able to use the GPUs on the driver node.
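To make the topology concrete, here is a minimal sketch in plain Python (no Databricks or PyTorch calls) of how the 6 GPUs would map to processes if one process ran per GPU. The node names and the one-process-per-GPU convention are assumptions for illustration, not something confirmed by the cluster itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str  # hypothetical label, not a real Databricks hostname
    gpus: int  # number of GPUs attached to this node


def rank_layout(nodes):
    """Assign one global rank per GPU, walking the nodes in order.

    Returns a list of (global_rank, node_name, node_rank, local_rank) tuples.
    """
    layout = []
    rank = 0
    for node_rank, node in enumerate(nodes):
        for local_rank in range(node.gpus):
            layout.append((rank, node.name, node_rank, local_rank))
            rank += 1
    return layout


# The cluster described above: driver + 2 workers, 2 GPUs each.
cluster = [Node("driver", 2), Node("worker-0", 2), Node("worker-1", 2)]

layout = rank_layout(cluster)
world_size = len(layout)  # total processes if every GPU is used
driver_only = [entry for entry in layout if entry[1] == "driver"]

for entry in layout:
    print(entry)
print(f"world size: {world_size}, ranks on driver only: {len(driver_only)}")
```

In these terms, the current behaviour corresponds to only the driver's two ranks being launched, while the goal is a job whose world size covers all three nodes.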

Is there any way to make use of all 6 GPUs, given that I don’t have terminal access to the cluster?

Thanks in advance.
