Evaluate model at saved checkpoint

Hi, I'm using `Seq2SeqTrainer` for question generation (QG) and trying to inspect sample output from the model at a saved checkpoint, like so:

model = ProphetNetForConditionalGeneration.from_pretrained('squad_training_latest/results/checkpoint-212000')
…
    contexts = [tdata['context'][i]]
    answers = [tdata['answer'][i]]
    questions = [tdata['question'][i], tdata['question2'][i]]

    encoder_inputs, decoder_inputs = preprocess_batch(contexts, questions, answers)
    decoder_inputs = decoder_inputs.contiguous()

    question_ids = model.generate(encoder_inputs, early_stopping=False, return_dict_in_generate=False, eos_token_id=102, min_length=64)
    rv = tokenizer.batch_decode(question_ids, skip_special_tokens=False)
    print(rv)

The results look very bad and degenerate, e.g. ['[SEP] what was the the the the the name of the? [X_SEP] the the? name of the?'] for every example.

However, the eval loss was low, so I'm wondering whether I'm evaluating correctly. Should I use Trainer.predict instead, or load the model a different way?

(Side question: I noticed that Trainer.evaluate() returns teacher-forced predictions, which differ from the output of model.generate(). I was wondering why that is, and whether the predict_with_generate parameter has anything to do with it.)
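To make the side question concrete, here's a self-contained toy sketch of what I suspect is going on (nothing to do with ProphetNet; the corpus and all names here are made up). Under teacher forcing, every step is conditioned on the gold prefix, so per-step predictions (and loss) can look fine, while free-running generation conditions on the model's own outputs and can fall into a repetitive loop like my "the the the" output:

```python
# Toy greedy bigram "model" illustrating teacher forcing vs. free-running
# generation. Everything here is a made-up sketch, not my actual setup.
from collections import Counter, defaultdict

corpus = "what was the name of the first university in the state".split()

# "Train": count bigrams; predict the most frequent next word.
nxt = defaultdict(Counter)
for a, b in zip(corpus, corpus[1:]):
    nxt[a][b] += 1

def predict(prev):
    # Greedy next-word prediction from bigram counts.
    return nxt[prev].most_common(1)[0][0] if nxt[prev] else "<eos>"

# Teacher forcing: each prediction is conditioned on the GOLD prefix,
# so most steps match the reference and the "loss" looks low.
teacher_forced = [predict(w) for w in corpus[:-1]]

# Free-running generation: the model conditions on its OWN previous
# output, so one ambiguous step sends it into a repeating cycle.
tok, generated = corpus[0], [corpus[0]]
for _ in range(8):
    tok = predict(tok)
    generated.append(tok)

print("teacher-forced:", teacher_forced)
print("generated:     ", generated)  # degenerates into "the name of the name of ..."
```

The teacher-forced predictions match the reference on 8 of 10 steps, while the generated sequence loops, which is (I think) the same mismatch I'm seeing between the low eval loss and the bad generate() output.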