TrOCR model not utilising the GPU even though I specified it

I am running the basic TrOCR code example and moving the model and inputs to the GPU, but it appears to use only the CPU. The model is loaded into GPU memory, but GPU utilization stays at 0%.

Here is the code:

import torch
import time
from PIL import Image
import requests
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
print(torch.version.cuda)
# Load image from the IAM database (actually this model is meant to be used on printed text)
url = 'https://fki.tic.heia-fr.ch/static/img/a01-122-02-00.jpg'
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Detect available device correctly
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)

t1 = time.perf_counter()

# Load processor and model
processor = TrOCRProcessor.from_pretrained('microsoft/trocr-small-printed')
model = VisionEncoderDecoderModel.from_pretrained('microsoft/trocr-small-printed').to(device)

# Transfer image pixel values to GPU if available
pixel_values = processor(images=image, return_tensors="pt").pixel_values.to(device)

# Generate text on the GPU (generate() already returns a tensor on the model's device)
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(f"Generated text: {generated_text}")
print(f"Time taken: {time.perf_counter() - t1}")

# Sleep to observe GPU usage (optional)
time.sleep(10)

And here are the logs:

root@a100:~/Rayserver# python3 transformer_process_text.py 
12.1
cuda:0
Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration. Please open a PR/issue to update `preprocessor_config.json` to use `image_processor_type` instead of `feature_extractor_type`. This warning will be removed in v4.40.
/usr/local/lib/python3.10/dist-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
Some weights of VisionEncoderDecoderModel were not initialized from the model checkpoint at microsoft/trocr-small-printed and are newly initialized: ['encoder.pooler.dense.bias', 'encoder.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1137: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
Generated text: UNPLUS THE
Time taken: 9.403108296999562
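
A note on reading the "Time taken" figure: in the script, t1 is set before from_pretrained, so the 9.4 s covers model loading and the transfer to the GPU as well as generate(). Timing each phase separately may show how much of it is actual inference. The following is a minimal sketch of such a timer using only the standard library. The name phase_timer and its sync parameter are hypothetical helpers introduced here, not part of transformers or torch, and the sleep calls are stand-ins for the real loading and generation steps.

```python
import time
from contextlib import contextmanager


@contextmanager
def phase_timer(name, results, sync=None):
    """Record wall-clock seconds for one phase into results[name].

    sync is an optional zero-argument callable that is run before each
    clock read (for example torch.cuda.synchronize), so that GPU work
    queued asynchronously is counted inside the phase that launched it.
    """
    if sync is not None:
        sync()
    start = time.perf_counter()
    try:
        yield
    finally:
        if sync is not None:
            sync()
        results[name] = time.perf_counter() - start


if __name__ == "__main__":
    timings = {}
    with phase_timer("load", timings):
        time.sleep(0.05)  # stand-in for from_pretrained(...).to(device)
    with phase_timer("generate", timings):
        time.sleep(0.01)  # stand-in for model.generate(pixel_values)
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.3f} s")
```

In the script above, the two sleep calls would be replaced by the from_pretrained(...).to(device) lines and the model.generate(pixel_values) call, passing sync=torch.cuda.synchronize when the device is CUDA. If the generate phase turns out to take only a fraction of a second, a single short burst of GPU work could easily be missed by nvidia-smi, especially when checked during the final time.sleep(10).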

nvcc is not installed. Below are the nvcc-related path on the machine and the nvidia-smi output:

/usr/share/cmake-3.22/Modules/FindCUDA/run_nvcc.cmake
root@a100:~/Rayserver# nvidia-smi 
Mon Apr 15 05:36:00 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06   Driver Version: 525.125.06   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID A100D-2-20C    On   | 00000000:06:00.0 Off |                   On |
| N/A   N/A    P0    N/A /  N/A |      0MiB / 20480MiB |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| MIG devices:                                                                |
+------------------+----------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |         Memory-Usage |        Vol|         Shared        |
|      ID  ID  Dev |           BAR1-Usage | SM     Unc| CE  ENC  DEC  OFA  JPG|
|                  |                      |        ECC|                       |
|==================+======================+===========+=======================|
|  0    0   0   0  |      0MiB / 18411MiB | 28      0 |  2   0    1    0    0 |
|                  |      0MiB /  4096MiB |           |                       |
+------------------+----------------------+-----------+-----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
root@a100:~/Rayserver#