Hi,
I have just trained my own tokenizer from scratch, a WordPiece model like BERT's, and saved it.
I now want to train my own language model from scratch using that tokenizer.
However, in the code below, what should I change model_checkpoint to?
model_checkpoint = "gpt2"
tokenizer_checkpoint = "drive/wordpiece-like-tokenizer"
Since my tokenizer is a WordPiece model like BERT's, should gpt2 be changed to something else?
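For context, here is a minimal sketch of how I assume the two checkpoints would be used with the Hugging Face transformers Auto classes. The function name build_model_from_scratch is hypothetical and only for illustration. As far as I understand, model_checkpoint only supplies the architecture config here (no pretrained weights are loaded), while the vocabulary size comes from my own tokenizer.

```python
def build_model_from_scratch(model_checkpoint, tokenizer_checkpoint):
    """Return (tokenizer, model) where the model has randomly initialised weights.

    Hypothetical helper for illustration; assumes transformers and torch are installed.
    """
    # Imported inside the function so the sketch can be defined without the libraries.
    from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

    # Load the tokenizer I saved earlier (a local directory in my case).
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_checkpoint)

    # Take only the architecture settings from model_checkpoint and
    # override vocab_size so the embedding matrix matches my tokenizer.
    config = AutoConfig.from_pretrained(model_checkpoint, vocab_size=len(tokenizer))

    # from_config builds the architecture without loading pretrained weights.
    model = AutoModelForCausalLM.from_config(config)
    return tokenizer, model
```

With my values this would be called as build_model_from_scratch("gpt2", "drive/wordpiece-like-tokenizer"). AutoModelForCausalLM matches a gpt2 config; I assume a BERT-style config would need AutoModelForMaskedLM instead.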
Thanks.