Encoder-decoder `generate()` with forced start for decoder?

I’m training an encoder-decoder model (BART) for spell checking, which I framed as a seq2seq task. For example, given the sentence `I liike thus framewok much.`, the model is trained to predict the corrected sentence: `I like this framework much.`

Now, at inference time, I would like to force the beginning of the decoder output. For example, for the partial sentence `I like this framewok`:

  • I want to feed `I like this framewok` to the encoder
  • I want to give `I like this` to the decoder and let it predict the next word (if the model is well trained, it should predict `framework`).

How can I achieve this with the `generate()` method?

cc’ing @joaogante here

I got something working by using the keyword argument `decoder_input_ids`:

# Create the input for the encoder: the sentence with the typo
encoder_inp = tokenizer(["I like this framewok"], max_length=model.config.max_position_embeddings, padding=True, truncation=True, return_tensors="pt")

# Create the input for the decoder: the first part of the sentence, without the typo
decoder_inp = tokenizer(["I like this"], max_length=model.config.max_position_embeddings, padding=True, truncation=True, return_tensors="pt")

# Remove the EOS token appended by the tokenizer, so the decoder continues the sentence instead of treating it as finished
decoder_inp["input_ids"] = decoder_inp["input_ids"][:, :-1]

# Then call generate() with the encoder's input AND the decoder's initial input
out = model.generate(encoder_inp["input_ids"], num_beams=2, min_length=1, max_length=model.config.max_position_embeddings, decoder_input_ids=decoder_inp["input_ids"])

This seems to work as I expect, but I’d love to get a second opinion from someone more knowledgeable!
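
One fragile spot in the snippet above is the blind `[:, :-1]` slice: it only removes the EOS token if EOS really is the last token, which holds for a single unpadded sentence. Here is a minimal sketch that makes that step explicit and wraps the whole call. `drop_trailing_eos` and `generate_with_prefix` are hypothetical helper names (not part of transformers), and `model` / `tokenizer` are assumed to be the BART model and tokenizer from the snippet above.

```python
def drop_trailing_eos(token_ids, eos_token_id):
    """Return a copy of token_ids (a list of ints) without a final EOS token."""
    if token_ids and token_ids[-1] == eos_token_id:
        return list(token_ids[:-1])
    return list(token_ids)


def generate_with_prefix(model, tokenizer, source, prefix, **generate_kwargs):
    """Encode `source`, force the decoder to start with `prefix`, and decode.

    Hypothetical wrapper: it only combines calls already used in this thread.
    """
    import torch  # imported lazily so drop_trailing_eos has no dependencies

    # Encoder side: the sentence that still contains the typo
    enc = tokenizer([source], return_tensors="pt")

    # Decoder side: the corrected beginning, with the tokenizer's EOS removed
    prefix_ids = drop_trailing_eos(
        tokenizer(prefix)["input_ids"], tokenizer.eos_token_id
    )
    decoder_input_ids = torch.tensor([prefix_ids])

    out = model.generate(
        input_ids=enc["input_ids"],
        attention_mask=enc["attention_mask"],
        decoder_input_ids=decoder_input_ids,
        **generate_kwargs,
    )
    return tokenizer.batch_decode(out, skip_special_tokens=True)[0]
```

With the objects from above, a call would look like `generate_with_prefix(model, tokenizer, "I like this framewok", "I like this", num_beams=2)`. This sketch handles one sentence at a time; batching prefixes of different lengths would need extra care with padding and is not covered here.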

Hi @astariul, yup, using `decoder_input_ids` is precisely what you should do 👍
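
One practical detail when reading the result: the sequence returned by `generate()` includes the forced prefix, and depending on the library version it may be preceded by the decoder start token. If only the newly predicted tokens are wanted, something like the sketch below can slice them out. `continuation_ids` is a hypothetical helper name, and it works on plain Python lists (for example `out[0].tolist()`), which is an assumption about how the output is passed in.

```python
def continuation_ids(output_ids, prefix_ids):
    """Return the tokens that follow the first occurrence of prefix_ids.

    Both arguments are plain lists of token ids. The search tolerates extra
    leading tokens (such as a decoder start token) before the prefix.
    """
    n = len(prefix_ids)
    for start in range(len(output_ids) - n + 1):
        if output_ids[start:start + n] == prefix_ids:
            return output_ids[start + n:]
    raise ValueError("forced prefix not found in the generated sequence")
```

The remaining ids can then be passed to `tokenizer.decode(..., skip_special_tokens=True)` to get the predicted continuation as text.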