CUDA not working with ASR pipeline

Hi,
I want to use CUDA to run a pipeline on several one-minute audio clips. The model does seem to be loaded onto the GPU (NVIDIA GeForce RTX 3080): Task Manager shows it occupying 3.5 GB of GPU memory. But when I check the GPU usage, it stays at 1% and the temperature does not increase. I have no idea why this might occur.
Below are the relevant packages that I am using and the code snippet:

  • torch 2.2.1+cu118
  • torchaudio 2.2.1+cu118
  • cuda driver 11.8
  • transformers 4.36.1
import torch
from datasets import load_from_disk
from transformers import (
    WhisperFeatureExtractor,
    WhisperForConditionalGeneration,
    WhisperTokenizer,
    pipeline,
)

# hf_downloader, modelo and datapath are defined elsewhere (not shown)
model_path = hf_downloader.download_model(modelo)

# Use half precision on the GPU, full precision on the CPU
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
model = WhisperForConditionalGeneration.from_pretrained(
    model_path,
    torch_dtype=torch_dtype,
    low_cpu_mem_usage=True,
    use_safetensors=True,
)
device = "cuda:0" if torch.cuda.is_available() else "cpu"
model.to(device)
feature_extractor = WhisperFeatureExtractor.from_pretrained(model_path)
tokenizer = WhisperTokenizer.from_pretrained(model_path, language="spanish", task="transcribe")

asr_pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    feature_extractor=feature_extractor,
    tokenizer=tokenizer,
    chunk_length_s=10,
    stride_length_s=(4, 2),
    batch_size=16,
    torch_dtype=torch_dtype,
    device=device,
)

data = load_from_disk(datapath)  # audio dataset with 3000 clips
# Collect the raw waveforms, then transcribe them all in one call
arrays = [data['audio'][i]['array'] for i in range(len(data))]
res = asr_pipe(arrays, return_timestamps=True)
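
One thing I have not ruled out is that most of the wall-clock time is spent on the CPU before the GPU gets any work. That is only a guess. Below is a minimal sketch of how the two steps above could be timed separately. The helper name `time_stage` is made up for illustration and is not part of any library.

```python
import time


def time_stage(label, fn):
    """Run fn() once, print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f} s")
    return result, elapsed
```

In my script this would wrap the list comprehension and the pipeline call, for example `arrays, t_decode = time_stage("decode audio", lambda: [data['audio'][i]['array'] for i in range(len(data))])` followed by `res, t_asr = time_stage("asr pipeline", lambda: asr_pipe(arrays, return_timestamps=True))`. If the first number dominates, low GPU usage would be expected while that step runs.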

Has anyone experienced a similar issue when running an ASR pipeline with CUDA?