As there is no fast tokenizer available for BioGPT, I am not able to get the word_ids. Does anyone have any idea how to get the word_ids?
Hi,
BioGPT doesn't have a fast tokenizer implementation yet: Unable to convert BioGpt slow tokenizer to fast: token out of vocabulary · Issue #21838 · huggingface/transformers · GitHub. To contribute this, a new BioGPTConverter class would have to be defined here: transformers/convert_slow_tokenizer.py at main · huggingface/transformers · GitHub.
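In the meantime, a common workaround with a slow tokenizer is to pre-split the text into words yourself, encode each word separately, and record which word index each resulting token came from. This is only a sketch: the helper and the toy encoder below are hypothetical, and because BioGPT's slow tokenizer applies Moses tokenization plus BPE, encoding words in isolation may occasionally produce different subtokens than encoding the full sentence, so verify the alignment on your data.

```python
def manual_word_ids(words, encode):
    """Approximate fast-tokenizer word_ids() for a slow tokenizer.

    words:  list of pre-split words.
    encode: callable mapping one word to its list of token ids
            (e.g. lambda w: tok(w, add_special_tokens=False)["input_ids"]
            for a slow Hugging Face tokenizer).
    Returns (input_ids, word_ids) with one word index per token.
    """
    input_ids, word_ids = [], []
    for idx, word in enumerate(words):
        ids = encode(word)          # tokens for this single word
        input_ids.extend(ids)
        word_ids.extend([idx] * len(ids))  # same word index for each subtoken
    return input_ids, word_ids


# Demo with a toy encoder that splits each word into 2-character chunks,
# standing in for a real slow tokenizer:
toy_encode = lambda w: [w[i:i + 2] for i in range(0, len(w), 2)]
ids, wids = manual_word_ids("gene expression".split(), toy_encode)
print(wids)  # [0, 0, 1, 1, 1, 1, 1]
```

With the real BioGPT slow tokenizer you would pass something like `lambda w: tokenizer(w, add_special_tokens=False)["input_ids"]` as `encode` (checkpoint name and exact call are assumptions; check them against your setup).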