Running tokenizer on dataset: 28%|████████████████▍ | 111/393 [00:28<01:12, 3.88ba/s]
[WARNING|tokenization_utils_base.py:3048] 2021-11-11 10:46:54,553 >> Be aware, overflowing tokens are not returned for the setting you have chosen, i.e. sequence pairs with the 'longest_first' truncation strategy. So the returned list will always be empty even if some tokens have been removed.
Not sure if this is the right category for it.
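
For context, here is a minimal pure-Python sketch of what I understand the 'longest_first' strategy to do for a sequence pair: it drops one token at a time from whichever sequence is currently longer. The function name `truncate_longest_first` and its arguments are made up for illustration and are not the library's API; the empty overflow list simply mirrors what the warning describes.

```python
def truncate_longest_first(ids, pair_ids, num_tokens_to_remove):
    """Illustrative sketch only, not the library implementation.

    Removes tokens one at a time from the end of whichever sequence
    is currently longer (ties go to the second sequence).
    """
    ids, pair_ids = list(ids), list(pair_ids)
    for _ in range(num_tokens_to_remove):
        if not ids and not pair_ids:
            break  # nothing left to remove
        if len(ids) > len(pair_ids):
            ids.pop()
        else:
            pair_ids.pop()
    # Assumption mirroring the warning: for pairs, removed tokens are
    # not collected, so the overflow list stays empty.
    overflowing = []
    return ids, pair_ids, overflowing


first = [1, 2, 3, 4, 5, 6]
second = [10, 11, 12]
kept_first, kept_second, overflow = truncate_longest_first(first, second, 4)
print(kept_first, kept_second, overflow)
```

Because the removed tokens can come from both sequences in turn, there is no single contiguous overflow to hand back. If the overflow is actually needed, the 'only_first' and 'only_second' truncation strategies cut from one sequence only, so they may be a better fit.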