How to fine-tune on 3 datasets of very different sizes (very large to very small)

anujn · February 24, 2023, 2:11am

Dear HF,
I have an interesting problem and I would love some advice on it:

I have 3 datasets: one very large (a few GB), one medium (a few hundred MB), and one small (a few MB).
I want to fine-tune a decoder-only LLM on them.

I want the model to generalise well from the large dataset (pick up general concepts), fit more closely to the medium one (pick up some structure), and then fit very closely to the small one (pick up the text structure well).

What is the best way to go about this?

Vary the learning rate?
Vary epochs?
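
To make the question concrete, here is a rough sketch of what varying the learning rate and epochs per dataset could look like as a plan, trained in order from large to small. Everything numeric is a placeholder assumption rather than a recommendation: the example counts for the large and medium datasets, the epochs, the learning rates, and the batch size are all made up, and `Stage` and `optimizer_steps` are hypothetical helpers, not part of any library.

```python
import math
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    num_examples: int     # assumed counts; only the small set (~100) is known
    epochs: int           # placeholder value
    learning_rate: float  # placeholder value


def optimizer_steps(stage: Stage, batch_size: int) -> int:
    """Number of optimizer updates a stage performs at a given batch size."""
    steps_per_epoch = math.ceil(stage.num_examples / batch_size)
    return steps_per_epoch * stage.epochs


# Train in this order: large -> medium -> small.
stages = [
    Stage("large", num_examples=1_000_000, epochs=1, learning_rate=1e-4),
    Stage("medium", num_examples=100_000, epochs=2, learning_rate=5e-5),
    Stage("small", num_examples=100, epochs=3, learning_rate=1e-5),
]

for stage in stages:
    steps = optimizer_steps(stage, batch_size=8)
    print(f"{stage.name}: lr={stage.learning_rate}, epochs={stage.epochs}, steps={steps}")
```

Laid out like this, the imbalance is easy to see: at any reasonable batch size the small dataset only gets a few dozen updates, while the large one gets many thousands.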

If so, are there any good starting points? Any information or advice would be much appreciated, as I’m struggling to know where to start here!

All the best!

P.S.
The small dataset is only around 100 examples. Training the LLM on it for 3 epochs gives good results, but I worry that more epochs would overfit too much. The large dataset could be huge in comparison!
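
One way the overfitting worry could be checked, as a minimal sketch: hold out a few of the ~100 examples and pick the epoch where the held-out loss is lowest. The 10% hold-out fraction, the seed, and the helper names `split_holdout` and `best_epoch` are assumptions for illustration, and the per-epoch validation losses would have to come from the actual training run.

```python
import random


def split_holdout(examples, holdout_fraction=0.1, seed=0):
    """Shuffle a copy of the examples and split into (train, validation)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * holdout_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    if not val_losses:
        raise ValueError("need at least one validation loss")
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1


# Stand-in for the ~100 real examples.
examples = [f"example {i}" for i in range(100)]
train, val = split_holdout(examples)
print(len(train), len(val))
```

If the held-out loss starts rising after a given epoch while the training loss keeps falling, that epoch is a reasonable place to stop.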
