I am using LEDForConditionalGeneration to fine-tune a summarization model. I am able to fine-tune the network and obtain good results for my purpose. But now I want to modify the encoder to use a different architecture. How can I break model.generate() into the individual components of the transformer?
For example, I want something like this:

```python
outputs = model(input_ids)
encoder_outputs = outputs["encoder_last_hidden_state"]  # encoder outputs
# Similar logic for decoder outputs, and then beam search to get text sentences
```
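To make the question concrete, here is a toy pure-PyTorch sketch of the pattern I am after (this is not LED; the module and its names are made up, and greedy search stands in for beam search): run the encoder once, then feed its hidden states to the decoder token by token.

```python
import torch
import torch.nn as nn

class ToySeq2Seq(nn.Module):
    """Minimal encoder-decoder to illustrate splitting generate() apart."""
    def __init__(self, vocab_size=32, d_model=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=1)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True),
            num_layers=1)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def encode(self, input_ids):
        # Runs once per input; this is where I would swap in a custom encoder.
        return self.encoder(self.embed(input_ids))

    def decode_step(self, decoder_input_ids, encoder_hidden):
        # One decoder forward pass; returns logits for the next token only.
        hidden = self.decoder(self.embed(decoder_input_ids), encoder_hidden)
        return self.lm_head(hidden[:, -1, :])

torch.manual_seed(0)
model = ToySeq2Seq()
input_ids = torch.randint(0, 32, (1, 10))

encoder_hidden = model.encode(input_ids)  # analogue of encoder_last_hidden_state
decoder_input_ids = torch.zeros((1, 1), dtype=torch.long)  # assumed BOS id 0
for _ in range(5):  # greedy loop standing in for beam search
    next_logits = model.decode_step(decoder_input_ids, encoder_hidden)
    next_token = next_logits.argmax(dim=-1, keepdim=True)
    decoder_input_ids = torch.cat([decoder_input_ids, next_token], dim=1)

print(decoder_input_ids.shape)  # torch.Size([1, 6])
```

What I cannot figure out is how to do the equivalent of this loop with the real LED model so that the result matches model.generate() exactly.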
How can I obtain such encoder and decoder outputs and get exactly the same result that I get with model.generate? A sample of my model.generate usage is as follows:

```python
ARTICLE_TO_SUMMARIZE = "Some long document that I want to summarize."
inputs = tokenizer.encode(ARTICLE_TO_SUMMARIZE, return_tensors="pt")
global_attention_mask = torch.zeros_like(inputs)
global_attention_mask[:, 0] = 1

# Generate summary
summary_ids = model.generate(inputs,
                             global_attention_mask=global_attention_mask,
                             num_beams=3, max_length=32)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True,
                       clean_up_tokenization_spaces=False))
```
Any help in this direction will be appreciated. Thanks!