How to force bos_token_id for each example individually in MBart?

Say I have a batch of examples with an input_ids field of size m×n and a bos_token_id field of size n (one BOS token per example). Is there a way to specify the bos_token_id for each example individually during the evaluation step when calling generate?
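For concreteness, here is a minimal sketch of the kind of per-example control being asked about. It sidesteps the single global forced_bos_token_id by building a two-token decoder prefix (decoder start token plus the per-example language token) and letting generate() continue from it. The checkpoint, language codes, and sentences are just placeholders, and this is an illustration rather than a confirmed recipe:

    import torch
    from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

    model_name = "facebook/mbart-large-50-many-to-many-mmt"  # placeholder checkpoint
    model = MBartForConditionalGeneration.from_pretrained(model_name)
    tokenizer = MBart50TokenizerFast.from_pretrained(model_name)

    tokenizer.src_lang = "en_XX"
    batch = tokenizer(["Hello world.", "How are you?"], return_tensors="pt", padding=True)

    # One target language (i.e. one BOS/language token) per example.
    tgt_langs = ["fr_XX", "de_DE"]
    bos_ids = torch.tensor([[tokenizer.lang_code_to_id[lang]] for lang in tgt_langs])

    # mBART decoding normally starts with decoder_start_token_id followed by the
    # forced language token, so build that prefix per example and let generate()
    # continue from it instead of passing a single forced_bos_token_id.
    start = torch.full_like(bos_ids, model.config.decoder_start_token_id)
    decoder_input_ids = torch.cat([start, bos_ids], dim=-1)  # shape (batch_size, 2)

    generated = model.generate(**batch, decoder_input_ids=decoder_input_ids, max_length=40)
    print(tokenizer.batch_decode(generated, skip_special_tokens=True))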


I’m also curious about this. @mralexis, did you ever work this out? A similar question was also asked here: M2M model finetuning on multiple language pairs, which also went unanswered.


I think I managed to do this, but my way of doing it is really hacky and fragile, so I wouldn’t recommend it. I’ve filed a feature request with the Hugging Face Transformers team to improve this at https://github.com/huggingface/transformers/issues/15500

That feature request links to a Colab notebook with the code showing how I did it.


Hey @nfortescue,

I tried your code and it works when I’m just training, but it seems to run into an error when I enable evaluation during training:

    "max_length": self._max_length if self._max_length is not None else self.model.config.max_length,
    AttributeError: 'M2MSeq2SeqTrainer' object has no attribute '_max_length'

The error is raised from this part of the code:

  # XXX: adapt synced_gpus for fairscale as well
  gen_kwargs = {
    "max_length": self._max_length if self._max_length is not None else self.model.config.max_length,
    "num_beams": self._num_beams if self._num_beams is not None else self.model.config.num_beams,
    "synced_gpus": True if is_deepspeed_zero3_enabled() else False,
  }
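One way to sidestep that AttributeError is to read the attributes defensively, since the stock Seq2SeqTrainer only sets _max_length and _num_beams inside evaluate()/predict() when max_length/num_beams are passed in. This is only a sketch of a possible change, not necessarily the exact one used here:

  _max_length = getattr(self, "_max_length", None)
  _num_beams = getattr(self, "_num_beams", None)
  gen_kwargs = {
    # Fall back to the model config when the trainer never set these attributes.
    "max_length": _max_length if _max_length is not None else self.model.config.max_length,
    "num_beams": _num_beams if _num_beams is not None else self.model.config.num_beams,
    "synced_gpus": True if is_deepspeed_zero3_enabled() else False,
  }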

After changing the gen_kwargs, the issue was bypassed, but then another error appeared, TypeError: forward() got an unexpected keyword argument 'forced_bos_token_id', which arose from the following lines:

    with torch.no_grad():
      with self.autocast_smart_context_manager():
        outputs = model(**inputs)

I resolved this by temporarily removing 'forced_bos_token_id' from the inputs before calling the model. However, does that mean the BOS token of the target sequence is now incorrect?
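For reference, a minimal sketch of that kind of workaround, assuming the batch produced by the data collator carries a forced_bos_token_id entry (the key name and where it comes from are assumptions here):

    # Pull the generation-only key out of the batch before the regular forward
    # pass, since the model's forward() does not accept forced_bos_token_id.
    # Keeping the value around lets a custom generation step still use it.
    forced_bos_token_id = inputs.pop("forced_bos_token_id", None)

    with torch.no_grad():
      with self.autocast_smart_context_manager():
        outputs = model(**inputs)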