Hi,
I want to use CUDA to run an ASR pipeline over several one-minute audio clips. The model does seem to be loaded onto the GPU (an NVIDIA GeForce RTX 3080): Task Manager shows it occupying about 3.5 GB of GPU memory. But when I then check GPU usage during inference, it stays at 1% and the temperature does not rise. I have no idea why this might occur.
Below are the relevant package versions and the code snippet:
- torch 2.2.1+cu118
- torchaudio 2.2.1+cu118
- cuda driver 11.8
- transformers 4.36.1
import torch
from datasets import load_from_disk
from transformers import (
    WhisperFeatureExtractor,
    WhisperForConditionalGeneration,
    WhisperTokenizer,
    pipeline,
)

model_path = hf_downloader.download_model(modelo)

# fp16 on GPU, fp32 fallback on CPU
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

model = WhisperForConditionalGeneration.from_pretrained(
    model_path,
    torch_dtype=torch_dtype,
    low_cpu_mem_usage=True,
    use_safetensors=True,
)
device = "cuda:0" if torch.cuda.is_available() else "cpu"
model.to(device)

feature_extractor = WhisperFeatureExtractor.from_pretrained(model_path)
tokenizer = WhisperTokenizer.from_pretrained(
    model_path, language="spanish", task="transcribe"
)

asr_pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    feature_extractor=feature_extractor,
    tokenizer=tokenizer,
    chunk_length_s=10,
    stride_length_s=(4, 2),
    batch_size=16,
    torch_dtype=torch_dtype,
    device=device,
)

data = load_from_disk(datapath)  # audio dataset with 3000 audios
arrays = [data['audio'][i]['array'] for i in range(len(data))]
res = asr_pipe(arrays, return_timestamps=True)
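One thing I considered, to rule out a data-loading bottleneck, is passing the pipeline an iterator instead of first materialising all 3000 arrays in RAM. This is only a sketch of that idea; the `audio`/`array` keys just mirror my dataset schema:

```python
def iter_audio_arrays(dataset):
    """Yield raw audio arrays one record at a time instead of
    building the whole list up front."""
    for record in dataset:
        yield record["audio"]["array"]

# Intended usage (untested on my end):
# res = asr_pipe(iter_audio_arrays(data), return_timestamps=True)
```

As far as I understand, transformers pipelines accept any iterable of inputs, so this should keep memory flat while the pipeline batches internally.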
Has anyone experienced a similar issue when trying to run an ASR pipeline with CUDA?
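For reference, here is my mental model of what `chunk_length_s=10` and `stride_length_s=(4, 2)` do to each clip. This is a rough stdlib sketch, not the actual transformers implementation; the step size (chunk minus both strides) and the 16 kHz sample rate are my assumptions:

```python
def chunk_windows(n_samples, chunk_s=10.0, stride_s=(4.0, 2.0), sr=16000):
    """Approximate the overlapping windows an audio clip is sliced into.

    Each window is chunk_s long; consecutive windows advance by
    chunk_s - stride_left - stride_right, so neighbouring windows
    overlap by the stride regions.
    """
    chunk = int(chunk_s * sr)
    step = chunk - int(stride_s[0] * sr) - int(stride_s[1] * sr)
    return [
        (start, min(start + chunk, n_samples))
        for start in range(0, n_samples, step)
    ]

# A one-minute clip at 16 kHz becomes 15 overlapping 10 s windows,
# which is why batch_size=16 should keep the GPU busy per clip.
windows = chunk_windows(60 * 16000)
```

If that picture is right, a 60 s clip alone nearly fills one batch, so I would expect visible GPU load, which makes the 1% utilisation even stranger.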