How to use multiple GPUs in parallel when fine-tuning a cross-encoder model

Hello,

I have a question. I have more than one GPU available, and I am fine-tuning a cross-encoder model with sentence-transformers. What modifications do I need to make so that the following function utilises all of the GPUs? I also need the model to be saved exactly as it is at the moment.

import gc
import math
from datetime import datetime

import torch
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CECorrelationEvaluator

# Function to train the model
def train_model(model_name, train_samples, test_samples, model_short_name,
                train_batch_size=16, num_epochs=4, warmup_pct=0.1):
    print('Starting: {}'.format(model_name))

    # Make sure the model gets saved and doesn't overwrite any previously trained models.
    # Note: entity_type and n are defined elsewhere in my script.
    model_save_path = f'models_RetrievalTrained/crossencoder_{model_name.replace("/", "-")}_{entity_type.lower()}_top{n}_{model_short_name}_{datetime.now().strftime("%Y-%m-%d_%H-%M-%S")}'
    train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=train_batch_size)

    warmup_steps = math.ceil(len(train_dataloader) * num_epochs * warmup_pct)

    evaluator_test = CECorrelationEvaluator.from_input_examples(test_samples, name='test_sample')

    # CrossEncoder is just a thin wrapper around a Hugging Face model object.
    model = CrossEncoder(model_name, num_labels=1)

    print('Begin Training: {}'.format(model_name))
    model.fit(train_dataloader=train_dataloader,
              evaluator=evaluator_test,
              epochs=num_epochs,
              warmup_steps=warmup_steps,
              output_path=model_save_path)

    print('Evaluations: {}'.format(model_name))
    # These scores don't matter too much because we ultimately evaluate using recall.
    # Reuse the test evaluator from above rather than rebuilding it.
    print('Test: {}'.format(evaluator_test(model)))
    evaluator_train = CECorrelationEvaluator.from_input_examples(train_samples, name='train_sample')
    print('Train: {}'.format(evaluator_train(model)))

    # Clean up so we don't leave stuff on the GPU
    del model
    gc.collect()
    torch.cuda.empty_cache()

    return model_save_path

I have multiple GPUs, but at the moment the code uses only one. How do I make use of all the GPUs that are available?
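For reference, the closest idea I have found is wrapping the underlying Hugging Face module in torch.nn.DataParallel before calling fit(). This is an untested sketch on my part: the model.model attribute and the unwrap-before-save step are assumptions based on reading the sentence-transformers source, and I dropped output_path from fit() because I suspect saving mid-training would try to call save_pretrained() on the wrapped module:

import torch
from sentence_transformers import CrossEncoder

model = CrossEncoder(model_name, num_labels=1)

# Split each forward pass across all visible GPUs.
# Assumption: CrossEncoder keeps the underlying Hugging Face module at
# model.model (true in the sentence-transformers source I looked at).
if torch.cuda.device_count() > 1:
    model.model = torch.nn.DataParallel(model.model)

model.fit(train_dataloader=train_dataloader,
          evaluator=evaluator_test,
          epochs=num_epochs,
          warmup_steps=warmup_steps)

# DataParallel itself has no save_pretrained(), so unwrap before saving
# to keep the on-disk format identical to the single-GPU case.
if isinstance(model.model, torch.nn.DataParallel):
    model.model = model.model.module
model.save(model_save_path)

Is something like this correct, or should I be looking at DistributedDataParallel / torchrun instead? And would the saved model still load the same way as it does now?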