How to run inference on multiple input sentences in parallel with beam search = 4?

I was wondering whether it is currently possible to run inference on multiple input sentences at once with beam search = 4.

For example, can the sequential loop below be replaced with parallel generation?
```python
import torch
from transformers import BartForConditionalGeneration, BartTokenizer
from tqdm import tqdm

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model_name = 'facebook/bart-large-cnn'
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name).to(device)

model.eval()

input_sentences = [
    "The sun rises in the east and sets in the west.",
    "Artificial intelligence is transforming the way we live.",
    "The ocean is vast and full of mysteries."
]

for input_text in tqdm(input_sentences):
    # Tokenize the input text
    inputs = tokenizer(input_text, return_tensors='pt', max_length=1024, truncation=True).to(device)

    # Generate sentences
    with torch.no_grad():
        outputs = model.generate(
            inputs['input_ids'],
            max_length=100,  # Adjust based on your desired sentence length
            num_return_sequences=5,  # Return 5 sequences per input (must be <= num_beams)
            num_beams=5,  # Beam search for better quality
            early_stopping=True
        )
```

Batched generation with num_beams > 1 would save a lot of time.
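
For what it's worth, batched beam search has worked for me by tokenizing the whole list with padding and calling `generate()` once. Here is a minimal sketch, assuming a recent transformers version; the `num_beams`/`max_length` values are just illustrative choices:

```python
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model_name = "facebook/bart-large-cnn"
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name).to(device)
model.eval()

input_sentences = [
    "The sun rises in the east and sets in the west.",
    "Artificial intelligence is transforming the way we live.",
    "The ocean is vast and full of mysteries."
]

# Tokenize the whole batch at once; padding aligns the sequences so a
# single generate() call can run beam search over all of them in parallel.
inputs = tokenizer(
    input_sentences,
    return_tensors="pt",
    padding=True,
    max_length=1024,
    truncation=True,
).to(device)

with torch.no_grad():
    outputs = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],  # mask out padding tokens
        max_length=100,
        num_beams=4,             # beam search width
        num_return_sequences=1,  # must be <= num_beams
        early_stopping=True,
    )

# outputs has shape (batch_size * num_return_sequences, seq_len)
for text in tokenizer.batch_decode(outputs, skip_special_tokens=True):
    print(text)
```

The key points, as far as I can tell, are `padding=True` so the batch forms a single tensor, and passing `attention_mask` so beam search ignores the padded positions.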

It seems this was not supported before, but I am not sure whether there is a better way to do it now.
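
Alternatively, the high-level `pipeline` API can handle the batching for you. This is a sketch under the assumption that your transformers version forwards generation kwargs such as `num_beams` through the pipeline call (recent versions do); `batch_size=8` is an arbitrary illustrative choice:

```python
import torch
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="facebook/bart-large-cnn",
    device=0 if torch.cuda.is_available() else -1,  # GPU if available
)

input_sentences = [
    "The sun rises in the east and sets in the west.",
    "Artificial intelligence is transforming the way we live.",
    "The ocean is vast and full of mysteries."
]

# The pipeline batches the inputs internally; num_beams is forwarded to generate().
results = summarizer(input_sentences, batch_size=8, num_beams=4, max_length=100)
for r in results:
    print(r["summary_text"])
```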
