I’ve seen the BigBirdPegasus and BigBird pages in the transformers docs, but I don’t understand the difference.
Is BigBirdPegasus a pretrained BigBird encoder plus a pretrained BigBird decoder that is then fine-tuned for summarization?