Tensor types mismatch when trying to enable GPU

nastassja-bellisario · June 16, 2023, 3:58pm

Hello,

I need help figuring out what I’m missing from this tutorial about Whisper fine-tuning.

I’m working in a containerized Docker environment with the following specifications:

Base image: ubuntu:22.04
Python 3.10
Pipenv to manage dependencies and Python environment

I’m using the Italian subset of the original dataset common_voice and working with just 1% of the subset’s data. Here’s the code snippet:

from datasets import DatasetDict, load_dataset
common_voice = DatasetDict()  # container for the train/test splits
common_voice["train"] = load_dataset("mozilla-foundation/common_voice_11_0", "it", split="train[:1%]+validation[:1%]", use_auth_token=True)
common_voice["test"] = load_dataset("mozilla-foundation/common_voice_11_0", "it", split="test[:1%]", use_auth_token=True)

For now, my goal is to get the whole tutorial running end to end; I care more about a working pipeline than about the quality of the final model. Dataset loading and data preparation all work. However, when I call trainer.train(), the progress bar stays at 0% for hours, even though the dataset is relatively small. Here is my trainer configuration:

# -------- #
# training #
# -------- #

print("Configuring training parameters...")
training_args = Seq2SeqTrainingArguments(
    optim="adamw_torch",
    output_dir="./whisper-small-it",  # Change to a repo name of your choice
    per_device_train_batch_size=16,
    gradient_accumulation_steps=1,  # Increase by 2x for every 2x decrease in batch size
    learning_rate=1e-5,
    warmup_steps=500,
    max_steps=4000,
    gradient_checkpointing=True,
    fp16=False,  # FP16 half precision evaluation (`--fp16_full_eval`) can only be used on CUDA devices
    evaluation_strategy="steps",
    per_device_eval_batch_size=8,
    predict_with_generate=True,
    generation_max_length=225,
    save_steps=1000,
    eval_steps=1000,
    logging_steps=25,
    report_to=["tensorboard"],
    load_best_model_at_end=True,
    metric_for_best_model="wer",
    greater_is_better=False,
    push_to_hub=True,
)

trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    train_dataset=common_voice["train"],
    eval_dataset=common_voice["test"],
    data_collator=data_collator.DataCollatorSpeechSeq2SeqWithPadding(processor=processor),
    compute_metrics=compute_metrics,
    tokenizer=processor.feature_extractor,
)

trainer.train()

Due to the slow training process, I decided to move outside my container and run the application on my MacBook Pro (Apple M2 Max) to utilize my GPU. I configured Accelerate using the following command:

python -c "from accelerate.utils import write_basic_config; write_basic_config(mixed_precision='fp16')"

I also set fp16=True in my training arguments. However, when running the application on my Mac with this setup, I get the following error when calling trainer.train():

Traceback (most recent call last):
  File "/macos-local/main.py", line 126, in <module>
    trainer.train()
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/transformers/trainer.py", line 1645, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/transformers/trainer.py", line 1938, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/transformers/trainer.py", line 2759, in training_step
    loss = self.compute_loss(model, inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/transformers/trainer.py", line 2784, in compute_loss
    outputs = model(**inputs)
              ^^^^^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/transformers/models/whisper/modeling_whisper.py", line 1419, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/transformers/models/whisper/modeling_whisper.py", line 1268, in forward
    encoder_outputs = self.encoder(
                      ^^^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/transformers/models/whisper/modeling_whisper.py", line 822, in forward
    inputs_embeds = nn.functional.gelu(self.conv1(input_features))
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 313, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.local/share/virtualenvs/macos-local-Q10oLr9O/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 309, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Mismatched Tensor types in NNPack convolutionOutput
  0%|

I believe I’m missing something in the Accelerate configuration. If anyone can help me, I would really appreciate it!

Thank you!
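
One thing that may be worth checking: NNPACK is, as far as I know, a CPU convolution backend, so the error above suggests the convolution ran on the CPU with half-precision tensors rather than on the GPU. The comment in the training arguments also notes that fp16 is restricted to CUDA devices. Below is a minimal sketch of a guard that enables fp16 only when CUDA is present. The helper names detect_backends and pick_precision_flags are hypothetical, and falling back to full precision on MPS is an assumption, not a confirmed fix.

```python
def detect_backends():
    """Return (cuda_available, mps_available); both False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False, False
    cuda = torch.cuda.is_available()
    # torch.backends.mps only exists in newer PyTorch builds.
    mps_backend = getattr(torch.backends, "mps", None)
    mps = bool(mps_backend is not None and mps_backend.is_available())
    return cuda, mps


def pick_precision_flags(cuda_available, mps_available):
    """Choose a device label and whether to request fp16 mixed precision."""
    if cuda_available:
        return {"device": "cuda", "fp16": True}
    if mps_available:
        # Assumption: keep full precision on Apple silicon.
        return {"device": "mps", "fp16": False}
    return {"device": "cpu", "fp16": False}


cuda, mps = detect_backends()
flags = pick_precision_flags(cuda, mps)
print(flags)
```

The resulting flags["fp16"] value could then be passed as fp16=flags["fp16"] to Seq2SeqTrainingArguments instead of hard-coding True, and the printed device shows which backend PyTorch actually sees.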
