There seems to be a mistake in the documentation (pretrained_models.html) regarding BART


I’m yusukemori.

While checking the model descriptions in the pretrained_models list,
I found what seems to be a mistake regarding BART.

Regarding facebook/bart-large-cnn, the explanation is as follows:

12-layer, 1024-hidden, 16-heads, 406M parameters (same as base)
bart-large base architecture finetuned on cnn summarization task

If my understanding is correct, shouldn't "12-layer" and "(same as base)" be "24-layer" and "(same as large)"?
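
A minimal sketch of the layer count behind this question. The values in `assumed_config` are an assumption about what the model's configuration contains (12 encoder layers plus 12 decoder layers), and the names `assumed_config`, `total_layers`, and `describe` are only illustrative, not part of any library:

```python
# Assumed values, meant to mirror the configuration of facebook/bart-large-cnn.
# With the transformers library installed, the real values could be read with
# AutoConfig.from_pretrained("facebook/bart-large-cnn") instead of hard-coding them.
assumed_config = {
    "encoder_layers": 12,
    "decoder_layers": 12,
    "d_model": 1024,
    "encoder_attention_heads": 16,
}


def total_layers(config):
    """Count encoder and decoder layers together."""
    return config["encoder_layers"] + config["decoder_layers"]


def describe(config):
    """Format the counts in the same style as the documentation table."""
    return "{}-layer, {}-hidden, {}-heads".format(
        total_layers(config),
        config["d_model"],
        config["encoder_attention_heads"],
    )


print(describe(assumed_config))  # prints "24-layer, 1024-hidden, 16-heads"
```

If the documentation counts encoder and decoder layers together, as this sketch does, the entry would read "24-layer" rather than "12-layer".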

I’m sorry if my understanding is wrong, or if someone has already noticed and fixed it.

Thank you in advance.


It does seem wrong. Don’t hesitate to suggest a PR to fix this!

Thank you for checking my post and giving me advice!
I will suggest the PR as soon as possible!
(This will be my first PR to 🤗 Transformers. I’m worried, but I’m excited!)
