How to constrain mBart decoding to generate English-only output?

I use mbart conditional generation model from huggingface (here is the link). I use the model to finetune for a multilingual translation task (not exactly a translation task but for the sake of simplicity, I’ll call it a translation task).

I’ve noticed that to translate, for example, from Chinese to English, the model simply copies Chinese input textes to the output, which is an undesirable output for my task (having Chinese characters in the middle of the output text). I want to constrain the model’s decoding so that it only generates subwords with English alphabets.

I use forced_bos_token_id to tell the model to begin the translation with a specific language. However, it constrains only the beginning of the translation and not throughout the entire translation.

model.generate(input_ids=dev_input['input_ids']
               , attention_mask=dev_input['attention_mask']
               , forced_bos_token_id=forced_bos_token_id
               , num_beams=5
               )

I also gave a look at ‘Constrained Decoding’ (here). With this, we can constrain the decoder to generate an output text so that it includes certain token(s) but it does not work to exclude certain tokens (none-English alphabets).

Does anybody know how to force model to generate outputs with English-only characters during the decoding?