I would like to train GPT2 on WikiText from scratch (not fine-tune a pre-trained model). I launched the following script in this folder.
Now I have two questions:
1- Is what I did a correct approach to training GPT2 from scratch?
2- What hyperparameters should I use for this task? (As far as I can tell, the hyperparameters suggested in the existing examples in the Hugging Face repo are for fine-tuning a pre-trained model.)