I have a few questions I would like to ask my favorite experts.
I have a big dataset with 50 million entries. Each entry is a short text comparable to a tweet or a Midjourney prompt (50-100 words, full sentences and/or keywords).
I would like to find the best LLM to generate new random entries, but I still don't know which model to use.
- If I use GPT-3 + fine-tuning, I assume it's state of the art, but it will cost me $$$$.
- If I use Hugging Face (rough fine-tuning sketch below):
  - Which model should I choose to get good quality after fine-tuning?
  - Is there a benchmark covering LLMs on this kind of task? (I couldn't find one on HELM.)
  - Is there a big difference in results after fine-tuning a light model versus a big one (for example GPT-2 XL vs BLOOM-176B vs OPT-66B vs GPT-NeoX)? Are there any studies about this?
- If I want to run a light model on my NVIDIA RTX 3090, which model can I use? GPT-2 Large? GPT-2 XL? BLOOM-1B? GPT-NeoX? (See the second sketch below.)
- Maybe I don't need any of these LLMs at all and a BERT-style model (RoBERTa, DistilBERT, etc.) would do the job perfectly? If so, which one should I use?
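
To make the fine-tuning question concrete, this is roughly the setup I have in mind (a minimal sketch using GPT-2 and the Hugging Face Trainer; the file name `entries.txt` and all hyperparameters are placeholders I haven't validated):

```python
# Minimal fine-tuning sketch: causal LM fine-tuning of a GPT-2-class model
# on my entries. "entries.txt" (one entry per line) is a placeholder path.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2-large"  # candidate; gpt2-xl, EleutherAI/gpt-neo-1.3B, bigscience/bloom-1b1 are alternatives
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Load the entries from a plain-text file, one entry per line
dataset = load_dataset("text", data_files={"train": "entries.txt"})

def tokenize(batch):
    # Entries are short (50-100 words), so a modest max_length should be enough
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Causal LM objective (mlm=False): predict the next token
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="ft-entries",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    fp16=True,            # helps fit larger models on a 24 GB card
    logging_steps=500,
    save_steps=10_000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```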
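And for the 3090 question, this is how I was planning to sanity-check what fits in 24 GB and what raw (pre-fine-tuning) samples look like (again just a sketch; the model name, prompt, and sampling parameters are only candidates):

```python
# Quick generation / VRAM check on a single RTX 3090 (24 GB), in fp16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2-xl"  # ~1.5B params; EleutherAI/gpt-neo-1.3B or bigscience/bloom-1b1 would be similar
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16
).to("cuda")

prompt = "a surreal watercolor of"  # placeholder prompt in the style of my entries
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens=80,   # entries are 50-100 words
    do_sample=True,      # random sampling rather than greedy decoding
    top_p=0.95,
    temperature=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 1e9:.1f} GB")
```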
Thanks in advance for all your help! I definitely need it.