Hello, I am trying to load my own audio file, pass it through a Whisper pipeline, and print the transcription. My code looks like this so far:
import torch
import gradio as gr
import soundfile as sf
from IPython.display import Audio
from transformers import pipeline
import numpy
import ffmpeg
device = "cuda:0" if torch.cuda.is_available() else "cpu"

pipe = pipeline(
    "automatic-speech-recognition", model="openai/whisper-large-v3", device=device
)

def transcribe_speech(filepath):
    output = pipe(
        filepath,
        max_new_tokens=256,
        generate_kwargs={
            "task": "transcribe",
        },
        chunk_length_s=30,
        batch_size=8,
    )
    return output["text"]

transcribe_speech("example2.mp3")
Most of this was copied directly from the HF Audio course, but even though I've run

pip install ffmpeg
import ffmpeg

whenever I try to run the code I get "ValueError: ffmpeg was not found but is required to load audio files from filename".