How to run PyTorch with a Hugging Face pretrained DeBERTa model in a Jupyter notebook? Setup: Win11, RTX 3070

My desktop runs Windows 11 with an RTX 3070. I have an NLP task that loads a model with model_content = AutoModelForSequenceClassification.from_pretrained(self.model_path, config=self.model_config), so I would like to use the GPU in this machine.

I pip installed torch with this:

pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html

The installation succeeded. I then ran this check:

import torch

print(torch.__version__)
# Compute capabilities this torch build was compiled for
print(torch.cuda.get_arch_list())
# Check if CUDA is available
if torch.cuda.is_available():
    print("CUDA is available.")
    print("CUDA Version:", torch.version.cuda)
    print("Number of GPUs:", torch.cuda.device_count())
    print("Current CUDA Device Index:", torch.cuda.current_device())
    print("Current CUDA Device Name:", torch.cuda.get_device_name(torch.cuda.current_device()))
else:
    print("CUDA is not available.")

It returns:

1.9.0+cu111
['sm_37', 'sm_50', 'sm_60', 'sm_61', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'compute_37']

CUDA is available.
CUDA Version: 11.1
Number of GPUs: 1
Current CUDA Device Index: 0
Current CUDA Device Name: NVIDIA GeForce RTX 3070 Laptop GPU

However, when I run my script, it throws this error:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
c:\SynologyDrive\Python\nlp\commonlit2\deberta\large\debertav3-baseline-origin - Copy.ipynb Cell 14 line 2
      1 for target in ["content", "wording"]:
----> 2     train_by_fold(
      3         train,
      4         model_name=CFG.model_name,
      5         save_each_model=False,
      6         target=target,
      7         learning_rate=CFG.learning_rate,
      8         hidden_dropout_prob=CFG.hidden_dropout_prob,
      9         attention_probs_dropout_prob=CFG.attention_probs_dropout_prob,
     10         weight_decay=CFG.weight_decay,
     11         num_train_epochs=CFG.num_train_epochs,
     12         n_splits=CFG.n_splits,
     13         batch_size=CFG.batch_size,
     14         save_steps=CFG.save_steps,
     15         max_length=CFG.max_length
     16     )
     19     train = validate(
     20         train,
     21         target=target,
   (...)
     26         max_length=CFG.max_length
     27     )
...
   1769         )
   1770     AcceleratorState._reset_state(reset_partial_state=True)
   1771 self.distributed_state = None

ImportError: Using the `Trainer` with `PyTorch` requires `accelerate>=0.20.1`: Please run `pip install transformers[torch]` or `pip install accelerate -U`

I then of course installed the suggested “accelerate” library. First, the install failed; worse, it removed my torch 1.9.0+cu111 and installed torch 2.1.0. Whichever command I used, pip install transformers[torch] or pip install accelerate -U, it replaced my stable torch 1.9.0+cu111 in the end.

The error message also said that “accelerate” needs at least torch 1.10. I then tried to move up one version and get a CUDA build of it, but unfortunately I could not find one. There are many other combinations, but none of them seems to work with my 3070.
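To see what pip actually left behind after an install like this, it can help to print the installed versions directly. The following is only a minimal sketch, not part of the original setup: `installed_versions` and `split_local_tag` are hypothetical helper names, and it relies solely on the standard library's `importlib.metadata`. A local tag such as `cu111` indicates a CUDA wheel and `cpu` a CPU-only wheel; a version string with no tag says nothing either way, so `torch.cuda.is_available()` remains the real test.

```python
from importlib import metadata


def installed_versions(packages=("torch", "transformers", "accelerate")):
    """Return {package: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def split_local_tag(version):
    """Split '1.9.0+cu111' into ('1.9.0', 'cu111'); the tag is None if absent."""
    base, _, tag = version.partition("+")
    return base, tag or None


if __name__ == "__main__":
    # Print one line per package, e.g. after running a pip install command.
    for name, version in installed_versions().items():
        print(name, version)
```

Running this in a fresh notebook cell before and after a `pip install` shows whether the torch build was swapped out.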

Does anyone know how to fix this, so that the Hugging Face pretrained DeBERTa model trains on the RTX 3070 with CUDA enabled?

I have also tried setting deepspeed=None, but it made no difference.

training_args = TrainingArguments(
            output_dir=model_fold_dir,
            load_best_model_at_end=True, # select best model
            learning_rate=learning_rate,
            per_device_train_batch_size=batch_size,
            per_device_eval_batch_size=8,
            num_train_epochs=num_train_epochs,
            weight_decay=weight_decay,
            report_to='none',
            greater_is_better=False,
            save_strategy="steps",
            evaluation_strategy="steps",
            eval_steps=save_steps,
            save_steps=save_steps,
            metric_for_best_model="rmse",
            save_total_limit=1,
            deepspeed=None  # Tried this to avoid the accelerate dependency; it had no effect
        )
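For reference, `deepspeed` already defaults to `None` in `TrainingArguments`, so as far as I can tell that argument does not influence the ImportError, which is about the installed `accelerate` version. A rough pre-flight check against the minimum quoted in the error message (0.20.1) could look like the sketch below. The helper names `release_tuple`, `meets_minimum` and `accelerate_ok` are hypothetical, and the comparison only looks at the numeric release part of a version string, ignoring local tags such as `+cu111` and pre-release suffixes.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Turn '0.20.1' or '1.9.0+cu111' into a tuple of ints such as (0, 20, 1)."""
    base = version.split("+", 1)[0]  # drop a local tag like '+cu111'
    numbers = re.match(r"\d+(?:\.\d+)*", base)
    if numbers is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in numbers.group().split("."))


def meets_minimum(installed, minimum):
    """True if installed >= minimum, comparing numeric release segments only."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.10' and '1.10.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def accelerate_ok(minimum="0.20.1"):
    """Check the installed accelerate against the minimum from the error message."""
    try:
        return meets_minimum(metadata.version("accelerate"), minimum)
    except metadata.PackageNotFoundError:
        return False
```

The same `meets_minimum` helper can be pointed at torch, for example `meets_minimum("1.9.0+cu111", "1.10")`, to confirm that a given build is below the version the error message asks for.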

Many thanks in advance.

Did you try restarting the notebook after updating the libraries? Let me know if it runs after restarting.

Yes, I have, but it doesn’t work.

Problem solved, thank you! :grinning:

That’s great :innocent:. Could you please explain how you solved the issue, for the sake of fellow members? If restarting the kernel fixed it, please mark that reply as the solution.

If I’m not mistaken, this is for the Kaggle CommonLit competition, right? Could you post your score once you submit?