Hi All,
I was fortunate enough to write a GPT-based model from scratch, and the code is below.
I'm really happy that it can produce coherent sentences, but it can't yet write fully logical ones and it sometimes hallucinates. I think I need to build and train a bigger model.
Currently the responses aren't very logical, so I'm planning to pump up the parameter count and the number of transformer blocks (currently only 6; GPT-2 small used 12). The embedding size is also small, just 384; I'm feeling I need to push it to at least around 900 (GPT-2 small used 768). Please tell me if it will work! Currently I feel it will!
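To sanity-check how much that scale-up buys, here's a rough parameter-count sketch. It assumes a GPT-2-style block layout (pre-norm, 4x MLP expansion) with a tied LM head; the vocab size of 50257 and context length of 256 are placeholder assumptions, swap in your own values:

```python
def gpt_param_count(n_layer: int, n_embd: int, vocab_size: int, block_size: int) -> int:
    """Approximate parameter count for a GPT-2-style decoder."""
    tok_emb = vocab_size * n_embd        # token embedding (tied with LM head)
    pos_emb = block_size * n_embd        # learned positional embedding
    # per block: 2 layernorms + attention (qkv + proj) + MLP (4x expansion)
    per_block = 12 * n_embd**2 + 13 * n_embd
    final_ln = 2 * n_embd
    return tok_emb + pos_emb + n_layer * per_block + final_ln

# assumed vocab/context; replace with your model's actual values
current = gpt_param_count(n_layer=6, n_embd=384, vocab_size=50257, block_size=256)
scaled = gpt_param_count(n_layer=12, n_embd=900, vocab_size=50257, block_size=256)
print(f"current ~{current/1e6:.0f}M params, scaled ~{scaled/1e6:.0f}M params")
```

With those assumed values the current config lands around 30M parameters and the scaled one around 160M, so the jump is roughly 5x; worth keeping in mind when budgeting RunPod time.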
I am also looking for a job that will help me grow in this field. I trained the model on RunPod for under $20, trust me. I also fine-tuned it on a question-answering dataset.
Feel free to contact me or reply on this thread!
Thanks