What does num_return_sequences > num_beams mean?


I’m currently using MarianMT models to generate translations from English to another language. My goal is to generate 900 translations of a single English sentence.

I was reading this other blog post (How to generate text: using different decoding methods for language generation with Transformers) and it mentions:

"In transformers, we simply set the parameter num_return_sequences to the number of highest scoring beams that should be returned. Make sure though that num_return_sequences <= num_beams!"

Currently, I’m able to generate 900 translations with a beam size of 3, and I’m wondering why this is possible. If I use a beam size of 900, I run into memory issues, so I’m curious what allows me to generate 900 sequences even though my beam size is much lower.
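To make the question concrete, here is my rough mental model of the two decoding modes as a toy sketch (this is not the transformers implementation, and the vocabulary/probabilities are made up): pure beam search can only ever hand back as many hypotheses as it keeps beams, whereas sampling draws each returned sequence independently, so the count isn’t tied to the beam width.

```python
import heapq
import random

# Toy next-token distribution over a tiny vocabulary (hypothetical numbers,
# purely to illustrate the decoding logic, not a real model).
VOCAB = ["a", "b", "c"]
PROBS = [0.5, 0.3, 0.2]

def beam_search(num_beams, length=3):
    """Pure beam search: only `num_beams` hypotheses survive each step,
    so at most `num_beams` distinct sequences can be returned."""
    beams = [("", 1.0)]
    for _ in range(length):
        candidates = [(seq + tok, score * p)
                      for seq, score in beams
                      for tok, p in zip(VOCAB, PROBS)]
        beams = heapq.nlargest(num_beams, candidates, key=lambda x: x[1])
    return [seq for seq, _ in beams]

def sample_sequences(num_return_sequences, length=3, seed=0):
    """Sampling: each returned sequence is an independent draw from the
    distribution, so num_return_sequences can be arbitrarily large."""
    rng = random.Random(seed)
    return ["".join(rng.choices(VOCAB, weights=PROBS, k=length))
            for _ in range(num_return_sequences)]

print(len(beam_search(num_beams=3)))   # beam search caps out at 3 sequences
print(len(sample_sequences(900)))      # sampling happily returns 900
```

If this picture is right, then setting something like do_sample=True (which I believe my call may be doing) would explain why 900 return sequences work with only 3 beams — but I’d like confirmation of what generate is actually doing under the hood.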

I appreciate the help!