I am inheriting from a pre-trained model:
from transformers import GPT2LMHeadModel, GPT2Tokenizer

class GPT2FinetunedWithNgrams(GPT2LMHeadModel):
    @timer  # custom decorator defined elsewhere in my code
    def __init__(self, config, model_tokenizer=None):
        super().__init__(config)
        self.tokenizer = GPT2Tokenizer.from_pretrained('gpt2', padding_side='right')
        self.tokenizer.pad_token = self.tokenizer.eos_token
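The subclass is instantiated through the usual Hugging Face from_pretrained flow, roughly like this (a simplified sketch, not my exact setup; it assumes the timer decorator above is defined):

# Simplified sketch of how the subclass is instantiated: from_pretrained
# builds the config and loads the pre-trained GPT-2 weights into this
# subclass, since it inherits from GPT2LMHeadModel.
model = GPT2FinetunedWithNgrams.from_pretrained('gpt2')
model.train()  # put the model in fine-tuning mode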
In the forward method, during fine-tuning, I need to generate sequences from the model that is being fine-tuned:
    def forward(
        self,
        input_ids=None,
        past=None,
        attention_mask=None,
        token_type_ids=None,
        position_ids=None,
        head_mask=None,
        inputs_embeds=None,
        labels=None,
        use_cache=True,
    ):
        beam_output = self.generate(
            input_ids,
            max_length=50,
            num_beams=5,
            early_stopping=True,
        )
        # Pass beam_output to a different loss function and return the loss
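For reference, beam_output is the tensor of generated token ids that I then want to feed into a custom loss. Here is a rough illustration of what generate returns, using a stock GPT-2 (just to show the output, not my fine-tuning code):

from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Rough illustration of generate()'s output on a stock GPT-2:
# a (batch_size, sequence_length) tensor of token ids.
tok = GPT2Tokenizer.from_pretrained('gpt2')
m = GPT2LMHeadModel.from_pretrained('gpt2')
ids = tok.encode("The dog", return_tensors='pt')
beam_output = m.generate(ids, max_length=50, num_beams=5, early_stopping=True)
print(beam_output.shape)                                    # e.g. torch.Size([1, 50])
print(tok.decode(beam_output[0], skip_special_tokens=True))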
My question is: will calling generate here use the weights of the current model that is being fine-tuned, or will it use static weights from some other GPT-2 model?
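In other words, once the optimizer updates the model's parameters, should self.generate reflect those updated parameters on the next forward pass? This hypothetical snippet shows what I mean by "live" versus "static" weights, done on a plain GPT2LMHeadModel (illustration only, not my fine-tuning code):

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Hypothetical check: perturb a weight and see whether generate()'s output
# changes, i.e. whether generation reads the model's current weights
# rather than a separate, static copy of GPT-2.
tok = GPT2Tokenizer.from_pretrained('gpt2')
m = GPT2LMHeadModel.from_pretrained('gpt2')
ids = tok.encode("The quick brown fox", return_tensors='pt')

out_before = m.generate(ids, max_length=20, num_beams=5, early_stopping=True)
with torch.no_grad():
    m.transformer.wte.weight.add_(0.05 * torch.randn_like(m.transformer.wte.weight))
out_after = m.generate(ids, max_length=20, num_beams=5, early_stopping=True)

print(torch.equal(out_before, out_after))  # False would mean the updated weights were used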