Hello, I am trying to load my own audio file, pass it through a Whisper pipeline, and print the transcription. My code looks like this so far:
import torch
import gradio as gr
import soundfile as sf
from IPython.display import Audio
from transformers import pipeline
import numpy
import ffmpeg
device = "cuda:0" if torch.cuda.is_available() else "cpu"

pipe = pipeline(
    "automatic-speech-recognition", model="openai/whisper-large-v3", device=device
)

def transcribe_speech(filepath):
    output = pipe(
        filepath,
        max_new_tokens=256,
        generate_kwargs={
            "task": "transcribe",
        },
        chunk_length_s=30,
        batch_size=8,
    )
    return output["text"]

transcribe_speech("example2.mp3")
Most of this was copied directly from the HF Audio course, but even though I've run

pip install ffmpeg
import ffmpeg

whenever I try to run the code I get "ValueError: ffmpeg was not found but is required to load audio files from filename".