Accelerator not performing multi-gpu train in jupyter

anon51616952 · May 28, 2023, 1:06am

I am trying to train the TimeSeries Transformer on multiple GPUs from a Jupyter notebook using Accelerate's Accelerator, but it always uses just one GPU. I’ve looked at and followed the available examples online, so I’m not sure what is wrong with the code (see below). Thanks in advance for any help.

accelerator = Accelerator()
device = accelerator.device

optimizer = AdamW(model.parameters(), lr=6e-4, betas=(0.9, 0.95), weight_decay=1e-1)

model, optimizer, train_dataloader = accelerator.prepare(model, optimizer, train_dataloader)

model.train()
for epoch in range(num_epochs):
    for idx, batch in enumerate(train_dataloader):
        optimizer.zero_grad()
        outputs = model(
            static_categorical_features=batch["static_categorical_features"].to(device)
            if config.num_static_categorical_features > 0
            else None,
            static_real_features=batch["static_real_features"].to(device)
            if config.num_static_real_features > 0
            else None,
            past_time_features=batch["past_time_features"].to(device),
            past_values=batch["past_values"].to(device),
            future_time_features=batch["future_time_features"].to(device),
            future_values=batch["future_values"].to(device),
            past_observed_mask=batch["past_observed_mask"].to(device),
            future_observed_mask=batch["future_observed_mask"].to(device),
        )
        loss = outputs.loss

        # Backpropagation
        accelerator.backward(loss)
        optimizer.step()

        if idx % 100 == 0:
            print(loss.item())

muellerzr · May 28, 2023, 1:42am

You need to make sure you're using the Jupyter launcher (notebook_launcher):
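A minimal sketch of what that could look like, assuming Accelerate's `notebook_launcher` and a machine with two GPUs. The tiny linear model, the random dataset, and the helper name `launch` are hypothetical stand-ins for illustration, not the TimeSeries Transformer setup from the question. The key point is that the whole training loop, including the creation of `Accelerator()`, lives inside a function that the launcher runs once per process.

```python
def training_function(num_epochs=1):
    # Imports are kept local in this sketch; top-level imports work as well.
    import torch
    from accelerate import Accelerator

    # Create the Accelerator inside the launched function, not in an earlier cell.
    accelerator = Accelerator()

    # Hypothetical stand-in model and data; replace with your own.
    model = torch.nn.Linear(4, 1)
    optimizer = torch.optim.AdamW(model.parameters(), lr=6e-4)
    dataset = torch.utils.data.TensorDataset(torch.randn(64, 4), torch.randn(64, 1))
    train_dataloader = torch.utils.data.DataLoader(dataset, batch_size=8)

    # prepare() wraps the objects for the current process and device.
    model, optimizer, train_dataloader = accelerator.prepare(
        model, optimizer, train_dataloader
    )

    model.train()
    for epoch in range(num_epochs):
        for inputs, targets in train_dataloader:
            optimizer.zero_grad()
            loss = torch.nn.functional.mse_loss(model(inputs), targets)
            accelerator.backward(loss)
            optimizer.step()
        # accelerator.print only prints on the main process.
        accelerator.print(f"epoch {epoch}: loss {loss.item():.4f}")


def launch(num_processes=2):
    # Hypothetical helper: starts one training process per GPU from the notebook.
    from accelerate import notebook_launcher

    notebook_launcher(training_function, args=(1,), num_processes=num_processes)
```

In a notebook cell you would then call `launch(num_processes=2)`. As far as I know, the launcher expects that CUDA has not already been initialized in the notebook (for example by an `Accelerator()` created in an earlier cell), so restarting the kernel before launching may be necessary.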
