How to add special tokens to a pretrained model?

anon58275033 · June 18, 2021, 2:48pm

Hi,

I am wondering: how do I add special characters to the tokenizer of a pretrained model?

For example, accents such as the following: é, à, è, ù, â, ê, î, ô, û, etc.

Thanks.
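
For reference, one common way to do this with the transformers library is to call `add_tokens` on the tokenizer and then resize the model's embedding matrix to match. Below is a minimal sketch, not a definitive recipe. The helper name `add_tokens_and_resize` and the `ACCENTED` list are made up for illustration; `get_vocab`, `add_tokens`, and `resize_token_embeddings` are the library calls it relies on.

```python
# Characters taken from the question; extend as needed.
ACCENTED = ["é", "à", "è", "ù", "â", "ê", "î", "ô", "û"]


def add_tokens_and_resize(tokenizer, model, new_tokens):
    """Hypothetical helper: add tokens the tokenizer lacks, then resize embeddings.

    Assumes `tokenizer` is a transformers tokenizer and `model` is a
    transformers PreTrainedModel. Returns the number of tokens added.
    """
    vocab = tokenizer.get_vocab()
    # Only request tokens that are not already in the vocabulary.
    missing = [tok for tok in new_tokens if tok not in vocab]
    num_added = tokenizer.add_tokens(missing)
    if num_added > 0:
        # The new embedding rows are freshly initialized, so the model
        # usually needs fine-tuning before these tokens are useful.
        model.resize_token_embeddings(len(tokenizer))
    return num_added
```

A typical call would load both objects with `AutoTokenizer.from_pretrained(...)` and `AutoModel.from_pretrained(...)` and then run `add_tokens_and_resize(tokenizer, model, ACCENTED)`. Two caveats: many pretrained vocabularies already contain these characters, in which case nothing is added, and some tokenizers (uncased BERT checkpoints, for example) strip accents during normalization, so it is worth checking `tokenizer.tokenize("é")` before and after.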
