Beam search does not reach the stopping criteria and causes CUDA OOM

Hello,

I use an MBart-based custom model with beam search for generation.

I observed that, depending on how the model is trained (more specifically, on the learning rate used), generation never meets the beam-search stopping criteria: beam_scorer.is_done stays False, so the loop runs until it exhausts GPU memory.

Here’s the code snippet and the parameters I use for generation.

eval_generated = self.model.generate(input_ids=dev_input["input_ids"],
                                     attention_mask=dev_input["attention_mask"],
                                     decoder_start_token_id=decoder_start_token_id,
                                     forced_bos_token_id=forced_bos_token_id,
                                     bad_words_ids=bad_words_ids,
                                     num_beams=5,
                                     max_new_tokens=512,
                                     early_stopping=True,)
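To make the failure mode concrete, here is a minimal toy sketch (plain Python, not the actual transformers internals) of how beam-search stopping interacts with max_new_tokens: each beam finishes when it emits EOS, and generation is otherwise capped by the token budget. If a badly trained model never puts enough probability on EOS, every beam runs the full 512 steps, and the per-step key/value caches for all beams keep growing until then. The step index and function name below are illustrative assumptions, not transformers API.

```python
def generate_steps(emits_eos_at, num_beams=5, max_new_tokens=512):
    """Return how many decoding steps run before generation stops.

    emits_eos_at: step index at which the (toy) model produces EOS,
    or None if it never does -- the pathological case from the post.
    """
    done = [False] * num_beams
    for step in range(1, max_new_tokens + 1):
        for b in range(num_beams):
            if not done[b] and emits_eos_at is not None and step >= emits_eos_at:
                done[b] = True
        if all(done):  # analogous to beam_scorer.is_done becoming True
            return step
    # Stopping criteria never met: we only stop because of max_new_tokens,
    # after memory for all max_new_tokens steps has already been allocated.
    return max_new_tokens

print(generate_steps(emits_eos_at=30))    # -> 30: stops as soon as all beams hit EOS
print(generate_steps(emits_eos_at=None))  # -> 512: runs the entire token budget
```

So with max_new_tokens=512 the decode length itself is bounded, but memory still scales with num_beams × sequence length; a model that never emits EOS always pays the worst case.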

The weird thing is that the behavior is not consistent: depending on the learning rate, the model sometimes generates outputs successfully with no CUDA OOM error. I’m on an A100 with 80 GB of memory, so capacity alone shouldn’t be the issue…

Does anybody know how to fix the problem?

Thank you very much,


I’m running into the same problem. Did you manage to solve it?
