What’s the difference, conceptually? I can understand the difference between the uncased and cased tokenizers for BERT.
But why this?
btw, bart-base and bart-large have the same “vocab_size”: 50265 in their configs.
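For reference, here is a minimal sketch of how to check that, assuming the public facebook/bart-base and facebook/bart-large checkpoints:

```python
# Quick check: compare the vocabulary size reported by the config
# and by the tokenizer for both BART checkpoints.
from transformers import AutoConfig, AutoTokenizer

for name in ["facebook/bart-base", "facebook/bart-large"]:
    config = AutoConfig.from_pretrained(name)
    tokenizer = AutoTokenizer.from_pretrained(name)
    print(name,
          "config vocab_size:", config.vocab_size,
          "tokenizer vocab_size:", tokenizer.vocab_size)
# Both report vocab_size = 50265.
```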
Thanks.
It is obviously related to the larger number of parameters used in bart-large, as mentioned in the description:

- facebook/bart-large: 24-layer, 1024-hidden, 16-heads, 406M parameters
- facebook/bart-base: 12-layer, 768-hidden, 16-heads, 139M parameters
Thanks for the reply, but why would a tokenizer depend on the number of model parameters? Isn’t it just responsible for tokenizing the text corpus, independent of the model’s size?
Easy there with the “obviously”. This isn’t obvious because, as @zuujhyt rightly says, the number of parameters is typically not directly related to the vocabulary. That is, the vocabulary (and hence the embedding indices) usually stays the same between the small and large variants of a model; it’s the model’s blocks that get wider and/or deeper. I think this is a good question.
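To make that concrete, here is a small sketch (again assuming the public HF checkpoints) that compares the two configs: the vocabulary size is identical, and it’s only the width and depth that change.

```python
# Compare architectural hyperparameters of bart-base and bart-large.
# Only the width (d_model) and depth (encoder/decoder layers) differ;
# the vocabulary, and therefore the embedding table's first dimension,
# is the same for both.
from transformers import AutoConfig

base = AutoConfig.from_pretrained("facebook/bart-base")
large = AutoConfig.from_pretrained("facebook/bart-large")

for attr in ["vocab_size", "d_model",
             "encoder_layers", "decoder_layers",
             "encoder_attention_heads", "decoder_attention_heads"]:
    print(f"{attr:>24}: base={getattr(base, attr)}, large={getattr(large, attr)}")
```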