I have fine-tuned a Whisper model for Malasar, an ultra-low-resource language.
When I run inference with the code below to evaluate its performance on a test set, it decodes a few entries in data_test and then fails with the error given in the title.
from tqdm import tqdm
from transformers.pipelines.pt_utils import KeyDataset

all_predictions = []

# run streamed inference
for prediction in tqdm(
    pipe(
        KeyDataset(data_test, "audio_path"),
        max_new_tokens=1024,
        generate_kwargs={"task": "transcribe"},
        batch_size=8,
    ),
    total=len(data_test),
):
    all_predictions.append(prediction["text"])
What does that error indicate, and what could be the possible solutions?
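
Since the failure only appears after a few entries, one way to narrow it down might be to run the samples one at a time and record which ones raise, instead of letting the first exception stop the loop. The following is only a minimal sketch of that idea: `collect_failures`, `transcribe_one`, and `fake_transcribe` are hypothetical names introduced for illustration, and the stub stands in for the real pipeline call so the snippet runs on its own.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple


def collect_failures(
    samples: Iterable[Any],
    transcribe_one: Callable[[Any], str],
) -> Tuple[List[Optional[str]], List[Tuple[int, str]]]:
    """Call transcribe_one on each sample and record failures instead of stopping."""
    predictions: List[Optional[str]] = []
    failures: List[Tuple[int, str]] = []
    for index, sample in enumerate(samples):
        try:
            predictions.append(transcribe_one(sample))
        except Exception as exc:  # deliberately broad: this is a diagnostic loop
            predictions.append(None)  # keep positions aligned with the inputs
            failures.append((index, f"{type(exc).__name__}: {exc}"))
    return predictions, failures


# Hypothetical stub standing in for the real model, so the sketch is runnable.
def fake_transcribe(sample: str) -> str:
    if sample == "bad.wav":
        raise ValueError("simulated decoding error")
    return f"text for {sample}"


predictions, failures = collect_failures(
    ["a.wav", "bad.wav", "c.wav"], fake_transcribe
)
for index, message in failures:
    print(f"sample {index} failed: {message}")
```

With the real pipeline, `transcribe_one` could presumably be something like `lambda path: pipe(path, generate_kwargs={"task": "transcribe"})["text"]`, applied to the "audio_path" values of data_test. Running without batching is slower, but it should show whether the error is tied to specific audio files or to the generation settings.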