Speed up beam search for item generation

Hi all,

I am using the following code to generate predictions from my trained model:

def generate_new_items(model, tokenizer, start_tensor, use_longer_words, num_return=100):
    if use_longer_words:
        tokens_per_item = 3
    else:
        tokens_per_item = 1

    beam_outputs = model.generate(
        start_tensor, 
        max_new_tokens=tokens_per_item,
        num_beams=num_return,
        num_return_sequences=num_return,
        early_stopping=True,
        pad_token_id=50256
    )
    new_item_tokens = beam_outputs[:, -tokens_per_item:]
    new_items = tokenizer.batch_decode(new_item_tokens, skip_special_tokens=True)

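    if use_longer_words:
        # Keep only the text before the first ", " and then before the first space,
        # stripping separators and punctuation from both ends.
        new_items = [x.split(", ")[0].strip(", \n.") for x in new_items]
        new_items = [x.split(" ")[0].strip(", \n.") for x in new_items]
    # Drop empty strings and convert the remaining strings to integers.
    new_items = [int(x) for x in new_items if x != ""]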

    return new_items

However, I find that the beam search used to generate the new items runs extremely slowly.
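To put a number on "slow", a small timing helper along these lines could be used. This is only a sketch: `time_call` is a hypothetical name that is not part of my code or of any library, and it simply reports the median wall-clock time over a few repeated calls.

```python
import statistics
import time


def time_call(fn, repeats=5):
    """Return the median wall-clock time in seconds over `repeats` calls of fn().

    `time_call` is a hypothetical helper for illustration, not a library function.
    """
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()  # the call being measured, e.g. a lambda wrapping generate_new_items
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)
```

For example, calling `time_call(lambda: generate_new_items(model, tokenizer, start_tensor, False, num_return=n))` for a few values of `n` should show how the runtime grows with the number of beams.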

I found a similar problem here, where the recommended solution is to use DeepSpeed. Unfortunately, I do not understand how to do that.

Is there any straightforward method to speed up the beam search?

I would be grateful for any help. Thanks in advance!

Hi,

I’m facing the same issue. Have you found a solution?

Thanks!