ViT problem on GPU: feature extractor requires the image to be numpy

Hello,
I'm using ViT from your package, specifically facebook/deit-base-patch16-384. When I test it on CPU the whole workflow runs fine, but when I test it on GPU I get this error:

Using GPU
Trainable param: 105521444
Traceback (most recent call last):
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\train.py", line 97, in
pred_points = model_gcn(graph, pool)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\model\mesh_network.py", line 29, in forward
features = pool(elli_points, self.feat_extr, self.transf)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\utils\pool.py", line 16, in call
feat_conv3, feat_conv4, feat_conv5 = feat_extr(self.im)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\model\image\transformer.py", line 14, in forward
inputs = self.feature_extractor(x[0], return_tensors="pt")
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\transformers\models\vit\feature_extraction_vit.py", line 141, in call
images = [self.resize(image=image, size=self.size, resample=self.resample) for image in images]
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\transformers\models\vit\feature_extraction_vit.py", line 141, in
images = [self.resize(image=image, size=self.size, resample=self.resample) for image in images]
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\transformers\image_utils.py", line 218, in resize
image = self.to_pil_image(image)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\transformers\image_utils.py", line 104, in to_pil_image
image = image.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

I tried to modify the image_utils.py file at line 104, changing the call to .cpu().clone().numpy(), but with that change in place the error becomes:

Using GPU
Trainable param: 105521444
Traceback (most recent call last):
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\train.py", line 97, in
pred_points = model_gcn(graph, pool)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\model\mesh_network.py", line 29, in forward
features = pool(elli_points, self.feat_extr, self.transf)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\utils\pool.py", line 16, in call
feat_conv3, feat_conv4, feat_conv5 = feat_extr(self.im)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\model\image\transformer.py", line 16, in forward
outputs = self.model(**inputs, output_hidden_states=True)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\transformers\models\vit\modeling_vit.py", line 572, in forward
embedding_output = self.embeddings(
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\transformers\models\vit\modeling_vit.py", line 135, in forward
embeddings = self.patch_embeddings(pixel_values, interpolate_pos_encoding=interpolate_pos_encoding)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\transformers\models\vit\modeling_vit.py", line 191, in forward
x = self.projection(pixel_values).flatten(2).transpose(1, 2)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\torch\nn\modules\conv.py", line 447, in forward
return self._conv_forward(input, self.weight, self.bias)
File "D:\WorkingDirectory\UNIVPM\Progetti\ComputerGraphics\3DGen\myversion\pixel2mesh-geometric\p2m\venv\lib\site-packages\torch\nn\modules\conv.py", line 443, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

So I don't understand: if the image I pass in is already a torch tensor, why does it need to be converted to a numpy array?
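From the two tracebacks it looks like the feature extractor does its resizing on the host via PIL/numpy, so it needs a CPU tensor, and the pixel_values it returns are also on the CPU, which then clash with the CUDA model weights. Below is a minimal sketch of the usual pattern, not the actual fix to this repository: x and device are placeholder names, and the dict assignment stands in for what the real feature_extractor call would return:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the image batch that lives on the GPU in the real code
x = torch.rand(1, 3, 384, 384, device=device)

# 1) Copy the image to host memory before preprocessing; in the real
#    code this would be: inputs = self.feature_extractor(x[0].cpu(), return_tensors="pt")
image_cpu = x[0].cpu()
inputs = {"pixel_values": image_cpu.unsqueeze(0)}  # pretend preprocessing output

# 2) Move every preprocessed tensor back to the model's device so the
#    conv weights (cuda) and the input (cpu) no longer mismatch
inputs = {k: v.to(device) for k, v in inputs.items()}

# Now a CUDA model could consume inputs without the FloatTensor /
# cuda.FloatTensor RuntimeError
assert inputs["pixel_values"].device.type == device.type
```

This avoids patching image_utils.py inside the installed package: the .cpu() happens at the call site, and the .to(device) restores the tensors to the GPU before the forward pass.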

Can you help me please?

Can someone please help me?

Do you have a code snippet to reproduce your error?

I can share my GitHub code, but privately for the moment, and you can reproduce the environment with the data. If you agree, please let me know your email.