Hello there! First of all, I am an intern at a financial company, working on an R&D project. My goal is to train an AI that can help customers and is able to reason. It is basically a classic in-app assistant, but I need it to reason as well. I tried Unsloth, GRPO, and LoRA, and fed it a dataset that I built from the app's specific button mapping. But it keeps giving absurd answers. So if anyone wants to help, I can provide more information about the process. Thank you in advance.
It seems that the know-how for training reasoning LLMs is becoming quite well established, and the course is about to be released.
There are many people on the HF Discord who have know-how regarding LLM training, so if you have any specialized questions, it's quicker to ask them there.
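In the meantime, since the setup described above uses GRPO with a button-mapping dataset, one thing that may be worth checking is how the reward is computed, because GRPO learns only from the reward signal. Below is a minimal sketch in plain Python, not tied to any library. The `<reasoning>`/`<answer>` tag format, the button ID `btn_transfer`, the function names, and the reward values are all assumptions for illustration, and the function would need adapting to whatever signature your trainer expects.

```python
import re

# Hypothetical output format: free-form reasoning, then exactly one button ID.
PATTERN = re.compile(
    r"^<reasoning>(.+?)</reasoning>\s*<answer>(.+?)</answer>\s*$",
    re.DOTALL,
)


def button_reward(completion: str, expected_button: str) -> float:
    """Score one completion against the expected button ID.

    Reward values are arbitrary illustrative choices:
    0.0 for malformed output, 0.2 for well-formed output with the
    wrong button, 1.0 for well-formed output with the right button.
    """
    match = PATTERN.match(completion.strip())
    if match is None:
        return 0.0  # output does not follow the expected format
    answer = match.group(2).strip()
    if answer == expected_button:
        return 1.0
    return 0.2  # format is fine, but the button is wrong


def score_batch(completions, expected_buttons):
    """Score a batch of completions, one float per completion."""
    return [button_reward(c, e) for c, e in zip(completions, expected_buttons)]


if __name__ == "__main__":
    sample = "<reasoning>User wants to send money.</reasoning><answer>btn_transfer</answer>"
    print(button_reward(sample, "btn_transfer"))
```

Printing the scores for a handful of sampled completions before training can show whether the reward separates good answers from absurd ones at all. If almost every sample gets the same score, the model has very little to learn from.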