I want to perform conditional generation with T5. My question is: does model.generate() actually do conditional generation? Say the desired sequence has three tokens, "A B C", and B is masked, so the model input is "A _ C". In this case, when I call model.generate() on the input ids of "A _ C", does the generation process take into consideration that C is also given? I have doubts because, according to the Generation docs, generate() works in an auto-regressive fashion, so I assume each prediction depends only on the previous tokens, not the later ones?
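To make my question concrete, here is a toy sketch of how I currently picture the generate() loop (everything here is a stand-in of my own, not a real transformers API): the encoder sees the whole input, while the decoder emits tokens one at a time.

```python
# Toy sketch of my mental model of encoder-decoder generation.
# The encoder sees the FULL input (including tokens after the mask);
# the decoder runs auto-regressively, each step conditioned on the
# encoder "memory" plus previously generated tokens.
# All functions are stand-ins to keep the sketch runnable.

def encode(input_tokens):
    # stand-in for the encoder: a "memory" of the whole input
    return list(input_tokens)

def decoder_step(memory, generated):
    # stand-in for one decoder step; a real model would return logits.
    # Here we just echo tokens from memory so the loop is runnable.
    return memory[len(generated) % len(memory)]

def generate(input_tokens, max_length=3):
    memory = encode(input_tokens)      # "A", "_", "C" are all visible here
    generated = []
    for _ in range(max_length):        # the auto-regressive loop
        next_token = decoder_step(memory, generated)
        generated.append(next_token)
    return generated

print(generate(["A", "_", "C"]))  # -> ['A', '_', 'C']
```

So my question, in terms of this sketch: does the real generate() condition on the later token C through the encoder memory, or only on what the decoder has produced so far?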
A related question comes from reading the code at Using the T5 model with huggingface's mask-fill pipeline · Issue #3985 · huggingface/transformers · GitHub, which seems to do conditional generation (though I have a question about it too). In particular, when I run the following code:
```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

t5_tokenizer = T5Tokenizer.from_pretrained("t5-small")
t5_mlm = T5ForConditionalGeneration.from_pretrained("t5-small").to(DEVICE)

# Input text with two masked spans
text = "The <extra_id_0> walks in <extra_id_1> park"
input_ids = t5_tokenizer(text, return_tensors="pt").input_ids.to(DEVICE)

# Generate one sequence with beam search (200 beams), max length 20
outputs = t5_mlm.generate(input_ids=input_ids, num_beams=200,
                          num_return_sequences=1, max_length=20)
print(t5_tokenizer.decode(outputs[0], skip_special_tokens=False,
                          clean_up_tokenization_spaces=False))
```
I get:

```
<extra_id_0> park offers<extra_id_1> the<extra_id_2> park.
```
The part that says " the<extra_id_2> park." seems redundant to me, since my input only contains the sentinels <extra_id_0> and <extra_id_1>. Why is it generated? Thanks!
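To show what I mean by redundant: I expect exactly one predicted span per sentinel in my input, so I parse the output with a small helper like this (my own hypothetical helper, not a transformers API) and get three spans instead of two:

```python
import re

def spans_from_output(decoded: str) -> dict:
    """Split a T5 denoising output into the text predicted for each
    sentinel token. Hypothetical helper for illustration only."""
    # re.split with a capture group keeps the sentinel indices:
    # "<extra_id_0> park offers<extra_id_1> the<extra_id_2> park."
    # -> ['', '0', ' park offers', '1', ' the', '2', ' park.']
    parts = re.split(r"<extra_id_(\d+)>", decoded)
    it = iter(parts[1:])
    return {int(k): v.strip() for k, v in zip(it, it)}

print(spans_from_output(
    "<extra_id_0> park offers<extra_id_1> the<extra_id_2> park."))
# -> {0: 'park offers', 1: 'the', 2: 'park.'}
```

The span keyed by 2 has no corresponding sentinel in my input, which is the part I don't understand.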