I’m training an encoder-decoder model (BART) on a spell-checking task, which I framed as a seq2seq task. For example, given the sentence `I liike thus framewok much.`, the model is trained to predict the corrected sentence `I like this framework much.`
Now at inference time, I would like to force the beginning of the decoder's output. For example, for the partial sentence `I like this framewok`:

- I want to input `I like this framewok` into the encoder
- I want to give `I like this` to the decoder, and let the decoder predict the next word (in this case, if the model is well trained, it should predict `framework`)
How can I achieve this with the `generate()` method?

I could get something to work by using the keyword argument `decoder_input_ids`:
```python
# Create the input for the encoder: the sentence with typos
encoder_inp = tokenizer(["I like this framewok"], max_length=model.config.max_position_embeddings, padding=True, truncation=True, return_tensors="pt")
# Create the input for the decoder: the first part of the sentence, without typos
decoder_inp = tokenizer(["I like this"], max_length=model.config.max_position_embeddings, padding=True, truncation=True, return_tensors="pt")
# Remove the EOS token appended by the tokenizer
decoder_inp["input_ids"] = decoder_inp["input_ids"][:, :-1]
# Call generate() with the encoder's input AND the decoder's initial input
out = model.generate(encoder_inp["input_ids"], num_beams=2, min_length=1, max_length=model.config.max_position_embeddings, decoder_input_ids=decoder_inp["input_ids"])
```
This seems to work as I expect it to, but I’d love to get a second opinion from someone more knowledgeable !
Hi @astariul – yup, using `decoder_input_ids` is precisely what you should do.
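To make the mechanism concrete without loading a real checkpoint, here is a toy sketch of what forcing a decoder prefix amounts to: decoding starts from the given tokens instead of an empty sequence, and generation continues from there. The `next_token` function below is a hypothetical stand-in (a hard-coded bigram lookup) for the model's decoder step, not BART itself.

```python
def next_token(prefix):
    # Hypothetical bigram lookup standing in for the model's prediction.
    bigrams = {"I": "like", "like": "this", "this": "framework", "framework": "<eos>"}
    return bigrams.get(prefix[-1], "<eos>")

def generate(forced_prefix, max_new_tokens=5):
    # Start decoding from the forced prefix instead of an empty sequence,
    # mirroring what passing decoder_input_ids to generate() does.
    tokens = list(forced_prefix)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)
        if tok == "<eos>":
            break
        tokens.append(tok)
    return tokens

print(generate(["I", "like", "this"]))
# ['I', 'like', 'this', 'framework']
```

In the real model the same idea applies at the token-id level, which is why stripping the tokenizer's trailing EOS from the prefix matters: otherwise the decoder would see an end-of-sequence marker mid-prefix and likely stop early.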