I have a few questions I would like to ask my favorite experts.
I have a big dataset with 50 million entries. Each entry is a short text comparable to a tweet or a Midjourney prompt (50-100 words, full sentences and/or keywords).
I would like to find the best LLM to generate new random entries, but I still don't know which model to use.
- If I use GPT-3 + fine-tuning, I assume it's state of the art, but it will cost me $$$$.
- If I use Hugging Face (rough fine-tuning sketch below):
  - Which model should I choose to get good quality after fine-tuning?
  - Is there a benchmark covering LLMs on this kind of task? (I couldn't find one on HELM.)
  - Is there a big difference in results after fine-tuning a light model versus a big one (for example GPT-2 XL vs BLOOM-176B vs OPT-66B vs GPT-NeoX)? Are there any studies about this?
- If I want to run a light model on my NVIDIA RTX 3090, which model can I use? GPT-2 Large? GPT-2 XL? BLOOM-1B? GPT-NeoX? (See the second sketch below.)
- Maybe I don't need any of these LLMs at all and a BERT-style model (RoBERTa, DistilBERT, etc.) would do the job perfectly? If so, which one should I use?
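
To make the fine-tuning question concrete, this is roughly the setup I have in mind (a minimal sketch using GPT-2 and the Hugging Face Trainer; the file name `entries.txt` and all hyperparameters are placeholders I haven't validated):

```python
# Minimal fine-tuning sketch: causal LM fine-tuning of a GPT-2-class model
# on my entries. "entries.txt" (one entry per line) is a placeholder path.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "gpt2-large"  # candidate; gpt2-xl, EleutherAI/gpt-neo-1.3B, bigscience/bloom-1b1 are alternatives
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Load the entries from a plain-text file, one entry per line
dataset = load_dataset("text", data_files={"train": "entries.txt"})

def tokenize(batch):
    # Entries are short (50-100 words), so a modest max_length should be enough
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Causal LM objective (mlm=False): predict the next token
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="ft-entries",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    fp16=True,            # helps fit larger models on a 24 GB card
    logging_steps=500,
    save_steps=10_000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```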
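And for the 3090 question, this is how I was planning to sanity-check what fits in 24 GB and what raw (pre-fine-tuning) samples look like (again just a sketch; the model name, prompt, and sampling parameters are only candidates):

```python
# Quick generation / VRAM check on a single RTX 3090 (24 GB), in fp16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2-xl"  # ~1.5B params; EleutherAI/gpt-neo-1.3B or bigscience/bloom-1b1 would be similar
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16
).to("cuda")

prompt = "a surreal watercolor of"  # placeholder prompt in the style of my entries
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens=80,   # entries are 50-100 words
    do_sample=True,      # random sampling rather than greedy decoding
    top_p=0.95,
    temperature=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 1e9:.1f} GB")
```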
Thanks in advance for all your help! I definitely need it.