HuggingFace ViT 10x Slower than Native TensorFlow (Not Fully Using GPU?)

andrewvk · July 16, 2022, 12:01am

Any ideas what could be causing this?

I have been using another implementation of ViT, and after switching to the Transformers library with ‘google/vit-base-patch16-224-in21k’, training is about 10x slower. It also takes almost 10 minutes to move on to the next epoch.

Old:

CPU usage is around 15-25% and GPU is close to 100% most of training.

HF ViT:

CPU usage is similar, but GPU usage is low and only sporadically spikes to 25%, 50%, or 100%.

I tried to run them with as many of the same settings as possible. A larger batch size seems to result in lower GPU usage for the HF ViT.

I think this might be because the image data loader (from this tutorial) does not load images as fast as the native TensorFlow ones, so the GPU cannot be fully utilized.
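One way to check this hypothesis is to time the data pipeline by itself, with no model attached. Below is a minimal sketch using only the Python standard library. The names `measure_throughput` and `slow_loader` are hypothetical helpers for illustration and are not part of Transformers or TensorFlow; `slow_loader` just simulates a loader with a fixed per-batch delay.

```python
import time
from itertools import islice


def measure_throughput(batches, num_batches=50, warmup=2):
    """Iterate over `batches` with no model attached; return batches per second."""
    it = iter(batches)
    # Discard a few batches so one-time setup cost is not counted.
    for _ in islice(it, warmup):
        pass
    start = time.perf_counter()
    # Consume up to `num_batches` batches and count how many were produced.
    count = sum(1 for _ in islice(it, num_batches))
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


def slow_loader(n, delay=0.01):
    """Hypothetical stand-in for an image loader taking `delay` seconds per batch."""
    for i in range(n):
        time.sleep(delay)
        yield i


if __name__ == "__main__":
    rate = measure_throughput(slow_loader(20), num_batches=10)
    print(f"{rate:.1f} batches/s")
```

The same function should accept any iterable of batches, so the real training dataset can be passed in place of `slow_loader`. If the loader alone yields fewer batches per second than the old training loop consumed, the input pipeline is the likely bottleneck rather than the model.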

Any help would be much appreciated.
