GPU is far slower than CPU for patch embedding

ViGeng · June 8, 2024, 9:31am

Hello guys!
I am playing around with ViT inference speed and have timed the embedding and encoding stages separately on CPU and GPU.
The results were not what I expected:

Time spent (ms)	GPU	CPU
Embedding	331	1
Encoding	4	72

details:
GPU model = RTX 3090Ti
CPU model = Intel i9-12900KF
Pretrained model weights = google/vit-base-patch16-224-in21k
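
For anyone who wants to reproduce numbers like these, here is a minimal sketch of a timing helper. It is not necessarily the code used for the measurements above. The name `time_ms` and its parameters are hypothetical. It rests on two general facts about GPU timing: the first CUDA call in a process pays one-time setup costs, and CUDA kernels are launched asynchronously, so a timer can stop before the work has finished unless the device is synchronized first. Whether either effect explains the table is an assumption that would need checking.

```python
import time
from typing import Callable, Optional


def time_ms(fn: Callable[[], object],
            warmup: int = 3,
            iters: int = 10,
            sync: Optional[Callable[[], None]] = None) -> float:
    """Return the mean wall-clock time of fn() in milliseconds.

    sync is an optional callable that blocks until pending device work
    has finished (on CUDA, torch.cuda.synchronize would be passed here).
    """
    # Warm-up runs are executed but not timed, so one-time setup
    # costs do not end up in the reported number.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()  # make sure the warm-up work is really done

    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()  # wait for queued work before stopping the clock
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iters


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a GPU or PyTorch.
    def dummy_stage() -> int:
        return sum(i * i for i in range(10_000))

    print(f"dummy stage: {time_ms(dummy_stage):.3f} ms")
```

With PyTorch and a Hugging Face `ViTModel` already moved to the GPU, the call would look roughly like `time_ms(lambda: model.embeddings(pixel_values), sync=torch.cuda.synchronize)`, and the same with `model.encoder` for the encoding stage. Treat those attribute names as an assumption about how the model was split.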

I can understand that the GPU is faster than the CPU for encoding. But why is the CPU faster than the GPU for embedding, given that both embedding and encoding are neural network layers that boil down to matrix multiplications?
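
To put the size of the embedding step in perspective, here is a rough back-of-the-envelope calculation. It assumes the standard ViT-base setup implied by the checkpoint name (224x224 RGB input, 16x16 patches, hidden size 768) and counts only the patch projection, ignoring the bias, the class token and the position embeddings. The function name is hypothetical.

```python
def patch_embedding_cost(image_size: int = 224,
                         patch_size: int = 16,
                         channels: int = 3,
                         hidden_size: int = 768) -> tuple:
    """Return (num_patches, patch_dim, multiply_accumulates) for one image."""
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    # Each patch is flattened to patch_size * patch_size * channels values.
    patch_dim = patch_size * patch_size * channels
    # One linear projection from patch_dim to hidden_size per patch.
    macs = num_patches * patch_dim * hidden_size
    return num_patches, patch_dim, macs


if __name__ == "__main__":
    num_patches, patch_dim, macs = patch_embedding_cost()
    print(f"patches: {num_patches}, values per patch: {patch_dim}, MACs: {macs:,}")
```

Under these assumptions the embedding is a single projection over 196 patches, which is a small amount of work next to the 12 transformer layers of the encoder. That makes a 331 ms GPU time for it look more like measurement overhead than compute, though this is only a plausibility check.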
Thanks a lot!
