Re-training NLP model with training AND validation dataset after validation has been done

It might work well, or it might not. When you fold the validation data into your training set, the model will converge differently because it sees and optimizes over different data. That can be good or bad (it is not a given that more data deterministically yields a better model), but the real problem is that you can no longer tell either way, because you have no held-out set left to measure on. If you do want to squeeze everything out of your data, cross-validation is the recommended approach: estimate generalization with k-fold CV, then refit the final model on all of the data.
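As a rough sketch of that workflow (using scikit-learn and a hypothetical toy corpus; your vectorizer, classifier, and data will differ): estimate performance with k-fold CV first, and only then refit on everything.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus standing in for a real NLP dataset.
texts = [
    "great film, loved it", "wonderful acting", "a joy to watch",
    "truly fantastic story", "brilliant and moving", "superb direction",
    "terrible movie", "awful plot", "a waste of time",
    "boring and dull", "dreadful acting", "painfully bad script",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())

# k-fold CV: every example serves as held-out data in exactly one fold,
# so the mean score estimates generalization without a fixed test split.
scores = cross_val_score(model, texts, labels, cv=3)
print("CV accuracy: %.2f" % scores.mean())

# Once you are satisfied with the CV estimate, refit on ALL the data
# to produce the final model -- the CV score is your performance claim.
model.fit(texts, labels)
```

The point is that the cross-validated score replaces the single held-out set: you still get an honest performance estimate, yet the deployed model benefits from every labeled example.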