GPU utilization goes up and down

I’m training wav2vec2 with transformers on 3 A100 GPUs, but GPU utilization is not at 100% the whole time; it goes up and down with every batch.
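
To make it concrete, this is the kind of per-GPU polling I’m looking at (assuming nvidia-smi; the query fields and the 1-second interval are just an example):

# example only: print per-GPU utilization and memory use once per second
nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv -l 1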

Is this normal?

I’m using the exact same command as the example here, only changing --nproc_per_node to 3:

python -m torch.distributed.launch --nproc_per_node 3 run_speech_recognition_ctc.py \
    --dataset_name="common_voice" \
    --model_name_or_path="facebook/wav2vec2-large-xlsr-53" \
    --dataset_config_name="tr" \
    --output_dir="./wav2vec2-common_voice-tr-demo-dist" \
    --overwrite_output_dir \
    --num_train_epochs="15" \
    --per_device_train_batch_size="4" \
    --learning_rate="3e-4" \
    --warmup_steps="500" \
    --evaluation_strategy="steps" \
    --text_column_name="sentence" \
    --length_column_name="input_length" \
    --save_steps="400" \
    --eval_steps="100" \
    --logging_steps="1" \
    --layerdrop="0.0" \
    --save_total_limit="3" \
    --freeze_feature_encoder \
    --gradient_checkpointing \
    --chars_to_ignore , ? . ! - \; \: \" “ % ‘ ” � \
    --fp16 \
    --group_by_length \
    --do_train \
    --do_eval
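
One variable I haven’t touched is the dataloader: with the default --dataloader_num_workers=0, batch loading and collation run in the training process itself, which I assume could explain per-batch dips. A variant I’m considering (the flag is a standard TrainingArguments option, so the example script should accept it; 4 workers is an arbitrary guess) is appending:

# hypothetical addition to the command above, not part of the original example
--dataloader_num_workers="4"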