🤗Tokenizers

Topic	Replies	Views	Activity
Encoding and then decoding text is not equal	2	250	August 12, 2024
HuggingChat (Android App)	0	68	August 8, 2024
Token Classification: How to tokenize and align labels with overflow and stride?	4	6244	July 22, 2024
Error with new tokenizers (URGENT!)	16	51696	July 22, 2024
Fine tuning a T5 model for translation - How do I apply my trained tokenizer to the target sentences?	0	55	July 20, 2024
When I use the chat_template of the Llama 2 tokenizer, the response of the IT model is empty	0	124	July 13, 2024
Update encode function slowTokenizer vs FastTokenizer	0	63	July 12, 2024
How do I remove tokens from a BPE Tokenizer's vocabulary?	2	793	July 3, 2024
Problem with AutoTokenizer	1	259	June 24, 2024
Tokenizer vs Model	0	290	June 24, 2024
Exporting tokenizer to an onnx model	1	1483	June 23, 2024
`additional_special_tokens` are not added	1	499	June 20, 2024
Tokenizer splits words with accents into separate subwords	0	94	June 20, 2024
Emojis poisoning tokenizer	0	146	June 17, 2024
Modifying the normalizer for pretrained tokenizers doesn't work consistently	2	131	June 12, 2024
Seq2SeqTrainer produces incorrect EvalPrediction after changing to another tokenizer	0	105	June 11, 2024
Use sentence-transformers/all-MiniLM-L6-v2 fully local	1	363	June 6, 2024
Get "using the `__call__` method is faster" warning with DataCollatorWithPadding	8	17576	June 3, 2024
Create entirely new vocabulary for tokenizer	0	132	May 30, 2024
Paligemma model Forward Method Not Returning Loss in Trainer #31045	0	167	May 26, 2024
BUGs on offset-mapping	0	190	May 24, 2024
How long to expect training to take, and guidance on subset size?	1	2157	May 23, 2024
Doubts about the tokenization strategy and the explanation of models through SHAP	0	241	May 22, 2024
Version incompatibility between transformers and tokenizers	0	1665	May 22, 2024
Can't load tokenizer using from_pretrained, Interface API	0	334	May 21, 2024
Unusual input_id size for distilBERT tokenizer	0	121	May 14, 2024
Unable to load saved tokenizer	0	285	May 14, 2024
Error loading tokenizer from local checkpoint directory	3	1657	May 13, 2024
Difference between tokenizer and convert_tokens_to_ids	0	345	May 12, 2024
Encode tokens without spaces between them	0	147	May 9, 2024