I’m yusukemori.

While checking the model descriptions in the pretrained_models list,
I found what seems to be a mistake regarding BART.

Regarding facebook/bart-large-cnn, the explanation is as follows:

12-layer, 1024-hidden, 16-heads, 406M parameters (same as base)
bart-large base architecture finetuned on cnn summarization task

If my understanding is correct, shouldn’t "12-layer" and "(same as base)" be "24-layer" and "(same as large)"?

I’m sorry if my understanding is wrong, or if someone has already noticed and fixed it.
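
For anyone who wants to double-check the layer count, here is a minimal sketch of the arithmetic. It assumes the model's config exposes `encoder_layers` and `decoder_layers` fields, and the values in the dictionary are my understanding of the bart-large settings, not something read from the actual checkpoint. `total_layers` and `bart_large_like` are hypothetical names used only for illustration.

```python
def total_layers(config: dict) -> int:
    """Sum the encoder and decoder layer counts of a BART-style config mapping."""
    return config["encoder_layers"] + config["decoder_layers"]


# Assumed values (my understanding of bart-large); verify against the real config.
bart_large_like = {
    "encoder_layers": 12,
    "decoder_layers": 12,
    "d_model": 1024,                # the "1024-hidden" in the description
    "encoder_attention_heads": 16,  # the "16-heads" in the description
}

print(total_layers(bart_large_like))  # 12 + 12 = 24
```

If the assumed values are right, the "12-layer" in the description would count only one side of the encoder-decoder stack, while the total matches the "24-layer" listed for the large model. With Transformers installed, the same fields could presumably be read from `AutoConfig.from_pretrained("facebook/bart-large-cnn")` instead of the hand-written dictionary.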

It does seem wrong. Don’t hesitate to suggest a PR to fix this!

Thank you for checking my post and giving me advice!
I will suggest the PR as soon as possible!
(This will be my first PR to :hugs: Transformers. I’m worried, but I’m excited!)

