Beam search does not reach the stopping criteria and causes cuda oom

kangje · November 13, 2023, 1:03pm

Hello,

I use an MBart-based custom model and use beam_search for model generation.

I observed that, depending on how the model was trained (more specifically, depending on the learning rate used), generation never meets the stopping criteria for beam search: beam_scorer.is_done is always False, so the code keeps running until it exhausts GPU memory.

Here’s the code snippet and parameters that I use for generation.

eval_generated = self.model.generate(input_ids=dev_input["input_ids"],
                                     attention_mask=dev_input["attention_mask"],
                                     decoder_start_token_id=decoder_start_token_id,
                                     forced_bos_token_id=forced_bos_token_id,
                                     bad_words_ids=bad_words_ids,
                                     num_beams=5,
                                     max_new_tokens=512,
                                     early_stopping=True,)

The weird thing is that the behavior is inconsistent: with some learning rates, the model generates outputs without a CUDA OOM error. I have an A100 GPU with 80GB+ of memory, so memory shouldn’t be an issue…
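One way to check whether the learning-rate dependence comes from some checkpoints rarely emitting the end-of-sequence token is to measure how many generated sequences contain it. The following is only a rough sketch: the helper name eos_rate, the prefix_len parameter, and the example token ids are hypothetical, and it assumes the generated ids have first been converted to plain Python lists (for example with eval_generated.tolist()).

```python
from typing import List, Sequence


def eos_rate(sequences: Sequence[List[int]], eos_token_id: int, prefix_len: int = 1) -> float:
    """Return the fraction of sequences that contain eos_token_id after the prefix.

    prefix_len skips the leading decoder start token(s), which in some
    configurations share an id with EOS and would otherwise count as a hit.
    """
    if not sequences:
        return 0.0
    finished = sum(1 for seq in sequences if eos_token_id in seq[prefix_len:])
    return finished / len(sequences)


# Hypothetical token ids: 2 stands in for EOS, 1 for padding.
example_outputs = [
    [2, 5, 6, 2, 1],  # emits EOS, then padding
    [2, 5, 6, 7, 8],  # never emits EOS after the start token
]
print(eos_rate(example_outputs, eos_token_id=2))
```

A rate close to zero for the failing checkpoints would suggest that the beams never finish and every sequence runs to the full max_new_tokens length, which is the most memory-hungry case.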

Does anybody know how to fix the problem?
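One possible cause that may be worth ruling out is that bad_words_ids contains the EOS token as a single-token entry, since a single-token entry bans that token at every step and no beam could then finish. This is only a guess about the setup, not a confirmed diagnosis. A minimal check, with a hypothetical helper name and hypothetical token ids:

```python
from typing import Iterable, List


def eos_banned_by_bad_words(bad_words_ids: Iterable[List[int]], eos_token_id: int) -> bool:
    """Return True if a single-token entry in bad_words_ids equals eos_token_id.

    Multi-token entries are ignored here: they only block the final token
    when it follows the rest of the banned sequence.
    """
    return any(len(seq) == 1 and seq[0] == eos_token_id for seq in bad_words_ids)


# Hypothetical ids for illustration; use the tokenizer's real eos_token_id.
example_bad_words = [[2], [17, 42]]
print(eos_banned_by_bad_words(example_bad_words, eos_token_id=2))
```

If the check returns False, the banned-word list is unlikely to be what prevents the beams from finishing.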

Thank you very much,

divolcter · November 5, 2024, 8:41am

I'm running into the same problem. Did you solve it?
