I’m working on a code-output model and doing something similar(ish): I strip out the whole response unless the model’s output begins with standard code keywords like `import`, `def`, ` ``` `, etc. You could try something similar. Maybe use a system prompt so the model always begins with a marker like "Answer:" that you can then splice out of the generation.
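A minimal sketch of that kind of filter — the keyword list and function name here are my own assumptions, not from any particular library:

```python
# Hypothetical filter: keep only the part of a generation that starts
# with a code-like keyword, and drop any chatty preamble before it.
CODE_PREFIXES = ("import ", "from ", "def ", "class ", "```")

def strip_to_code(generation: str) -> str:
    """Drop everything before the first line that looks like code."""
    lines = generation.splitlines()
    for i, line in enumerate(lines):
        if line.lstrip().startswith(CODE_PREFIXES):
            return "\n".join(lines[i:])
    return ""  # no code-like line found: discard the whole response

result = strip_to_code("Sure, here's the code:\nimport os\nprint(os.name)")
print(result)  # prints "import os\nprint(os.name)"
```

For the "Answer:" variant you'd just swap the prefix tuple for that marker and slice it off.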
You could also generate multiple outputs per question and only take the most common answer (self-consistency / majority voting).
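The voting step is just a few lines with the stdlib — a sketch, assuming you've already parsed one final answer string out of each sampled generation:

```python
from collections import Counter

def majority_answer(answers):
    """Return the most common final answer among sampled generations."""
    counts = Counter(answers)
    answer, _count = counts.most_common(1)[0]
    return answer

# e.g. three samples for one question, two of which agree
print(majority_answer(["42", "42", "41"]))  # prints "42"
```

Ties go to whichever answer was seen first, which is usually fine; you could also require a minimum vote count and abstain otherwise.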
Other things to try: use larger effective batch sizes via gradient accumulation, which gives you more stable gradient descent over long training runs.
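The bookkeeping for gradient accumulation is simple — here's a pure-Python toy on 1-D least squares just to show the pattern (the data, learning rate, and accumulation window are arbitrary choices for illustration; in a real setup you'd do the same with your framework's optimizer: scale each micro-batch loss by 1/accum_steps, backprop every micro-batch, step once per window):

```python
# Toy gradient accumulation: fit y = w*x on data where the true w is 2,
# accumulating gradients over 2 micro-batches per optimizer update.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 0.0
lr = 0.05
accum_steps = 2
grad_acc = 0.0

for step, (x, y) in enumerate(data):
    # gradient of (w*x - y)^2 w.r.t. w, scaled so the window averages
    grad_acc += 2 * (w * x - y) * x / accum_steps
    if (step + 1) % accum_steps == 0:
        w -= lr * grad_acc   # one update per accumulation window
        grad_acc = 0.0       # reset for the next window

print(round(w, 3))  # prints 2.375 (moving toward the true w = 2)
```

The payoff is a larger effective batch without the memory cost of actually materializing it.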
What would be cool is training the model to rederive the reasoning trace given the final answer and the question. That might be worth exploring if you’ve got time.