I’m working on a replication study of a paper that used OpenAI’s Codex model, and I’m looking for the closest model to it available on the model hub. NovelAI/genji-python-6B is the best one I have found so far, but after some testing it often seems to fixate on code formatting rather than on the semantics of the program. Is the model too small, was it trained on too little data, or is there something better on the model hub?
I also checked out CodeParrot (lvwerra/codeparrot), but it appears to struggle with longer prompts.
Should I try with different hyperparameters?
Thank you for any suggestion!
Pinging @lvwerra, who may have some ideas here based on his experience with CodeParrot.
You could also try EleutherAI/gpt-j-6B, whose training data included code and which performs pretty well.
To improve the quality of the generations, it makes sense to tune the sampling strategy. If the model’s first suggestion needs to be good, go for a low temperature; if you can take several tries, you can increase the temperature to get more variety in the generations.
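To make the temperature trade-off concrete, here is a minimal sketch in plain Python of how temperature reshapes a next-token distribution. The logits are made-up numbers and the helper names (`softmax_with_temperature`, `sample_tokens`) are hypothetical; this is an illustration of the idea, not what any particular model or library does internally.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities; temperature < 1 sharpens, > 1 flattens."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_tokens(logits, temperature, n, seed=0):
    """Draw n token indices from the temperature-scaled distribution."""
    rng = random.Random(seed)
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=n)


# Made-up scores for four candidate next tokens (assumption, for illustration only).
logits = [2.0, 1.0, 0.5, 0.1]

low = softmax_with_temperature(logits, 0.2)   # almost always picks the top token
high = softmax_with_temperature(logits, 1.5)  # spreads probability across candidates

print("T=0.2:", [round(p, 3) for p in low])
print("T=1.5:", [round(p, 3) for p in high])
print("distinct tokens sampled at T=1.5:", sorted(set(sample_tokens(logits, 1.5, 200))))
```

With the transformers library, the corresponding knobs are the `do_sample`, `temperature`, and `num_return_sequences` arguments of `generate`; which values work best for a given model is something you would have to find out by experiment.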
Thanks for the quick reply!
I think I hadn’t quite understood the value of including completed examples/solutions in the prompt: adding those improves performance significantly. Do you have any suggestions for literature to better understand sampling strategies/hyperparameters?
Btw, I’m getting a build error on the CodeParrot code generation demo (the CodeParrot Generation Space by lvwerra); it worked fine earlier today, though.
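To illustrate what including completed examples in the prompt can look like, here is a minimal sketch. The example tasks, the docstring-style layout, and the helper name `build_few_shot_prompt` are all hypothetical; they are not the prompts used in this thread or in the paper being replicated.

```python
def build_few_shot_prompt(solved_examples, new_task):
    """Concatenate solved (task, solution) pairs, then append the unsolved task."""
    parts = []
    for task, solution in solved_examples:
        # Each solved example is a task description followed by its solution.
        parts.append(f'"""{task}"""\n{solution}\n')
    # The new task is left open so the model continues with a solution.
    parts.append(f'"""{new_task}"""\n')
    return "\n".join(parts)


# Hypothetical solved examples, for illustration only.
solved_examples = [
    ("Return the sum of two numbers.", "def add(a, b):\n    return a + b"),
    ("Return the larger of two numbers.", "def larger(a, b):\n    return a if a > b else b"),
]

prompt = build_few_shot_prompt(solved_examples, "Return the square of a number.")
print(prompt)
```

The resulting string would be passed to the model as the prompt, so the completion starts right after the last task description.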