In `modeling_whisper.py`, `WhisperDecoder` can take either `input_ids` or `inputs_embeds` as its input. When I pass `inputs_embeds` of shape `(1, 1500, 384)`, I hit an error inside `WhisperPositionalEmbedding`, because the decoder's maximum target length (`max_target_positions`) is 448:

```
RuntimeError: The size of tensor a (1500) must match the size of tensor b (448) at non-singleton dimension 1
```

How can I resolve this issue?
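For context, here is a minimal sketch that reproduces the error. It assumes the Hugging Face `transformers` library; I build a whisper-tiny-sized model from a bare `WhisperConfig` (whose defaults are `d_model=384`, `max_target_positions=448`) instead of loading pretrained weights, so the shapes match my setup:

```python
import torch
from transformers import WhisperConfig, WhisperModel

# Tiny-sized config: d_model=384, decoder positional table of 448 rows
config = WhisperConfig(d_model=384, max_target_positions=448)
model = WhisperModel(config)

# Embeddings shaped like *encoder* features: (batch=1, 1500 frames, 384 dims)
inputs_embeds = torch.randn(1, 1500, 384)

# The decoder slices its positional table by the sequence length, but the
# table only has 448 rows, so the addition to a 1500-step input fails
err_msg = ""
try:
    model.decoder(inputs_embeds=inputs_embeds)
except RuntimeError as e:
    err_msg = str(e)

print(err_msg)
```

The shape `(1, 1500, 384)` is what the Whisper *encoder* produces for 30 s of audio, which is why it exceeds the decoder's 448-position table.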