Seq2Seq model for distilgpt2

Hello everybody,

I want to do a summarization task (as in the Summarization tutorial), but using "distilgpt2" instead of "t5-small". When I replace "t5-small" with "distilgpt2", I get the following error:

ValueError: Unrecognized configuration class <class 'transformers.models.gpt2.configuration_gpt2.GPT2Config'> for this kind of AutoModel: AutoModelForSeq2SeqLM.
Model type should be one of BartConfig, BigBirdPegasusConfig, BlenderbotConfig, BlenderbotSmallConfig, EncoderDecoderConfig, FSMTConfig, LEDConfig, LongT5Config, M2M100Config, MarianConfig, MBartConfig, MT5Config, PegasusConfig, PLBartConfig, ProphetNetConfig, T5Config, XLMProphetNetConfig.

My interpretation is that there is no seq2seq model class for "distilgpt2", since GPT-2 is a decoder-only architecture. Is that the case, or did I miss something?

Assuming there is no model I can use, do I have to do something like this:

Or is there an easier way to do it?
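One workaround along these lines would be wrapping "distilgpt2" as the decoder of a transformers EncoderDecoderModel (note that EncoderDecoderConfig appears in the list of supported configs in the error above). The sketch below is an assumption, not a tested recipe: it builds tiny, randomly initialized configs so it runs without downloading checkpoints, and the specific sizes (64-dim, 2 layers, vocab of 1000) are arbitrary.

```python
import torch
from transformers import (BertConfig, GPT2Config,
                          EncoderDecoderConfig, EncoderDecoderModel)

# Tiny, randomly initialised configs so the sketch runs without downloads;
# in practice you would start from pretrained checkpoints instead.
enc_cfg = BertConfig(vocab_size=1000, hidden_size=64, num_hidden_layers=2,
                     num_attention_heads=2, intermediate_size=128)
dec_cfg = GPT2Config(vocab_size=1000, n_embd=64, n_layer=2, n_head=2,
                     is_decoder=True, add_cross_attention=True)

# Tie the two halves together into a single seq2seq model.
config = EncoderDecoderConfig.from_encoder_decoder_configs(enc_cfg, dec_cfg)
model = EncoderDecoderModel(config=config)

# One forward pass: the encoder reads the "article",
# the decoder produces logits for the "summary".
input_ids = torch.randint(0, 1000, (1, 16))
decoder_input_ids = torch.randint(0, 1000, (1, 8))
out = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
print(out.logits.shape)  # (batch, decoder_seq_len, vocab_size)
```

With pretrained weights, the analogous call would be `EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "distilgpt2")` (the encoder checkpoint here is an assumption; GPT-2 cannot serve as the encoder since it has no bidirectional encoder stack).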

Thanks in advance