GPT2 for Punjabi
Pretrain GPT2 on Punjabi text to create a strong language generation model for Punjabi.
Model
A randomly initialized GPT2 model
Datasets
One can use the Punjabi Wikipedia Articles dataset on Kaggle.
Available training scripts
A causal language modeling script for Flax is available here.
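For readers new to causal language modeling, the data preparation can be illustrated with a minimal sketch. It is not taken from the Flax script. It assumes the text has already been tokenized into integer ids, and the helper names `group_into_blocks` and `make_clm_example` are hypothetical. The idea is to cut one long token stream into fixed-length blocks and train the model to predict each token from the tokens before it.

```python
from typing import Dict, List


def group_into_blocks(token_ids: List[int], block_size: int) -> List[List[int]]:
    """Split one long token stream into equal-sized blocks, dropping the remainder."""
    usable = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, usable, block_size)]


def make_clm_example(block: List[int]) -> Dict[str, List[int]]:
    """Build a next-token example: the target at position t is the token at t + 1."""
    return {"inputs": block[:-1], "targets": block[1:]}


# Toy ids standing in for tokenized Punjabi text (hypothetical values).
stream = list(range(10))
blocks = group_into_blocks(stream, block_size=4)
examples = [make_clm_example(b) for b in blocks]
```

Real training scripts often keep inputs and labels the same length and apply the one-position shift inside the loss function. The explicit shift here is only meant to make the objective visible.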
(Optional) Desired project outcome
The desired project output is a GPT2 model that can generate Punjabi text.