Probability of sequences generated by beam search in GPT2

I am using a Hugging Face model of type transformers.modeling_gpt2.GPT2LMHeadModel and generating text with beam search.

  • Is there any way to get the probability calculated during beam search for each returned sequence?
  • Can I add a condition so that a text sequence is returned only when it crosses some threshold probability?

The code below gives the tokens of the 5 texts, but I need the probability of each of those 5 sequences (a sketch of what I am hoping for follows the snippet).

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = GPT2Tokenizer.from_pretrained("some/local/path")  # same local path as the model

test_prefix = "Is this someth"
test_input_ids = tokenizer.encode(test_prefix, return_tensors='pt')
test_input_ids = test_input_ids.to(device)

model = GPT2LMHeadModel.from_pretrained("some/local/path")
model = model.to(device)  # keep the model on the same device as the inputs

test_beam_outputs = model.generate(
    test_input_ids,
    max_length=len(test_prefix.split(' ')) + 6,
    num_beams=5,
    early_stopping=True,
    length_penalty=0.5,
    num_return_sequences=5,
    no_repeat_ngram_size=2
)
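
For context, this is roughly the kind of thing I am hoping for: a minimal sketch that assumes a transformers version whose generate() accepts return_dict_in_generate and output_scores (I am not sure my version does), and uses a hypothetical prob_threshold cut-off for the second question above.

test_beam_outputs = model.generate(
    test_input_ids,
    max_length=len(test_prefix.split(' ')) + 6,
    num_beams=5,
    early_stopping=True,
    length_penalty=0.5,
    num_return_sequences=5,
    no_repeat_ngram_size=2,
    return_dict_in_generate=True,  # return a structured output instead of a plain tensor
    output_scores=True             # keep the scores computed during beam search
)

# sequences_scores would hold the final (length-penalized) log probability of each beam
prob_threshold = 0.01  # hypothetical cut-off for returning a sequence
for seq, score in zip(test_beam_outputs.sequences, test_beam_outputs.sequences_scores):
    prob = score.exp().item()  # convert the log probability back to a probability
    if prob >= prob_threshold:
        print(prob, tokenizer.decode(seq, skip_special_tokens=True))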
