OpenAI Whisper fine-tuning on an unknown language

When following this blog, I used English as the tokenizer language, since my unknown language is written with the English alphabet.

But in this step, if I remove the language setting from the generation config:

#model.generation_config.language = "hindi"
model.generation_config.task = "transcribe"

model.generation_config.forced_decoder_ids = None

How should I handle the decoder_start_token_id here?

data_collator = DataCollatorSpeechSeq2SeqWithPadding(
    processor=processor,
    decoder_start_token_id=model.config.decoder_start_token_id,
)

Can I just remove this part?

 #decoder_start_token_id=model.config.decoder_start_token_id,
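
For context, my understanding is that the collator in the blog uses decoder_start_token_id only to drop a leading start token from the labels when every sequence in the batch begins with it, because the model adds that token again during training. Below is a minimal plain-Python sketch of that check. The helper name strip_leading_start_token and the token ids are hypothetical, chosen only for illustration, and the real collator works on padded tensors rather than lists.

```python
from typing import List

IGNORE_INDEX = -100  # label value that the loss ignores at padded positions


def strip_leading_start_token(
    labels: List[List[int]], decoder_start_token_id: int
) -> List[List[int]]:
    """Drop the first token of every row, but only if ALL rows start with the start token.

    Hypothetical helper: a list-based stand-in for the tensor check in the collator.
    """
    if labels and all(row and row[0] == decoder_start_token_id for row in labels):
        return [row[1:] for row in labels]
    return labels


START = 50258  # placeholder id for illustration; read the real one from model.config

# Every row begins with the start token, so it is removed from each row.
stripped = strip_leading_start_token(
    [[START, 11, 12, IGNORE_INDEX], [START, 21, 22, 23]], START
)

# One row does not begin with the start token, so the batch is left untouched.
unchanged = strip_leading_start_token([[11, 12], [START, 21]], START)
```

If this reading is right, the argument is a required field of the collator, so dropping it from the call would presumably raise an error unless the class itself is changed as well.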
