Hello,
I have a question. I have multiple GPUs available, and I am fine-tuning a cross-encoder model. What modifications do I need to make so that the following function utilises all of the GPUs? I also need the model to be saved exactly as it is now.
import gc
import math
from datetime import datetime

import torch
from torch.utils.data import DataLoader
from sentence_transformers.cross_encoder import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CECorrelationEvaluator

# Function to train the model
def train_model(model_name, train_samples, test_samples, model_short_name,
                train_batch_size=16, num_epochs=4, warmup_pct=0.1):
    print('Starting: {}'.format(model_name))
    # Make sure the model gets saved and doesn't overwrite any previously trained models.
    # entity_type and n are globals defined elsewhere in my script.
    model_save_path = f'models_RetrievalTrained/crossencoder_{model_name.replace("/", "-")}_{entity_type.lower()}_top{n}_{model_short_name}_{datetime.now().strftime("%Y-%m-%d_%H-%M-%S")}'
    train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=train_batch_size)
    warmup_steps = math.ceil(len(train_dataloader) * num_epochs * warmup_pct)
    evaluator_test = CECorrelationEvaluator.from_input_examples(test_samples, name='test_sample')
    # CrossEncoder is just a wrapper around a Hugging Face model object.
    model = CrossEncoder(model_name, num_labels=1)
    print('Begin Training: {}'.format(model_name))
    model.fit(train_dataloader=train_dataloader,
              evaluator=evaluator_test,
              epochs=num_epochs,
              warmup_steps=warmup_steps,
              output_path=model_save_path)
    print('Evaluations: {}'.format(model_name))
    # These scores don't matter too much because we're ultimately going to evaluate using recall.
    print('Test: {}'.format(evaluator_test(model)))
    evaluator_train = CECorrelationEvaluator.from_input_examples(train_samples, name='train_sample')
    print('Train: {}'.format(evaluator_train(model)))
    # Clean up so we don't leave anything on the GPU.
    del model
    gc.collect()
    torch.cuda.empty_cache()
    return model_save_path
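For reference, I call the function with lists of sentence_transformers InputExample objects, roughly like this (the model name, labels, and the entity_type / n globals below are just placeholders for my real data):

from sentence_transformers import InputExample

# Placeholder globals that the save path above refers to.
entity_type = 'Drug'
n = 10

# Hypothetical data: each InputExample pairs a query with a candidate plus a relevance label.
train_samples = [
    InputExample(texts=['some query', 'a relevant candidate'], label=1.0),
    InputExample(texts=['some query', 'an irrelevant candidate'], label=0.0),
]
test_samples = [
    InputExample(texts=['another query', 'another candidate'], label=1.0),
]

save_path = train_model('cross-encoder/ms-marco-MiniLM-L-6-v2',
                        train_samples, test_samples,
                        model_short_name='minilm')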
I have multiple GPUs, but at the moment the code only uses one. How do I make use of all the GPUs available?
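One thing I was considering, but haven't verified, is wrapping the underlying Hugging Face model in torch.nn.DataParallel. In the sentence-transformers versions I've looked at, CrossEncoder keeps the HF model in its .model attribute and has a save() method, so a minimal sketch (assuming those internals, and accepting that the model is saved once at the end of fit() instead of on the best evaluator score) would be:

import torch

model = CrossEncoder(model_name, num_labels=1)

# Split each batch across all visible GPUs. This assumes CrossEncoder
# keeps the Hugging Face model in its .model attribute.
if torch.cuda.device_count() > 1:
    model.model = torch.nn.DataParallel(model.model)

# Pass output_path=None so fit() doesn't try to call save_pretrained()
# on the DataParallel wrapper mid-training (the wrapper has no such method).
model.fit(train_dataloader=train_dataloader,
          evaluator=evaluator_test,
          epochs=num_epochs,
          warmup_steps=warmup_steps,
          output_path=None)

# Unwrap before saving, so the checkpoint on disk looks exactly like the
# single-GPU one.
if isinstance(model.model, torch.nn.DataParallel):
    model.model = model.model.module
model.save(model_save_path)

If that's roughly right, I'd presumably also want to raise train_batch_size, since DataParallel splits each batch across the GPUs. Or is DistributedDataParallel (one process per GPU, e.g. via torchrun) the recommended route here?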