I’m working on fine-tuning a large language model (likely Llama-3.2-3B-Instruct) using Hugging Face’s tools, and I have a question about achieving a specific behavior.
I want the model to generate “thoughts” or “comments” (via Chain-of-Thought (CoT) prompting) before arriving at its structured answer. The final output should be a list of words (e.g., [word1, word2, ...]); it doesn’t have to be in square brackets necessarily, the point is just that it ends with a list of words. Importantly, I’d like to train the model based only on this final structured answer, not on the intermediate comments or thoughts.
Here’s an example of what I’m aiming for:
Input_prompt: Some prompt here
Answer: [word1, word2, ...]
Model output:
In order to answer your question, I need to think about... (model-generated comments)
... So, the final answer is: [word1, word2, ...]
Question:
Using Hugging Face’s tools (likely transformers, datasets, or trl), how can I fine-tune the model to:
1. Ensure the final output includes the structured answer (list format) after the intermediate comments?
2. Train the model based solely on the final answer, i.e. ignore the intermediate comments in the loss computation? (A rough sketch of the direction I was considering is below.)
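For reference, here is a minimal, untested sketch of what I had in mind: using TRL’s SFTTrainer together with DataCollatorForCompletionOnlyLM, which sets the labels of everything before a fixed response marker to -100 so that only the tokens after the marker contribute to the loss. The model name, the marker string, the dataset field name, and the assumption that each training example already contains the intermediate comments are all placeholders on my side, not something I have verified:

```python
# Rough sketch, not tested: train only on the final answer by masking
# everything before a fixed response marker. The marker string and the
# dataset contents below are illustrative assumptions.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DataCollatorForCompletionOnlyLM, SFTConfig, SFTTrainer

model_name = "meta-llama/Llama-3.2-3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers often lack a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical training rows: full text = prompt + intermediate comments
# + a fixed marker + the final list answer.
train_dataset = Dataset.from_list([
    {
        "text": (
            "Some prompt here\n"
            "In order to answer your question, I need to ...\n"
            "So, the final answer is: [word1, word2]"
        )
    },
])

# Loss is computed only on tokens that appear after this marker; the
# prompt and the comments before it get label -100 and are ignored.
# (With Llama tokenizers it may be safer to pass the marker's token ids,
# since the string can tokenize differently depending on context.)
response_template = "So, the final answer is:"
collator = DataCollatorForCompletionOnlyLM(response_template, tokenizer=tokenizer)

trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    data_collator=collator,
    args=SFTConfig(
        output_dir="llama-answer-only-sft",
        dataset_text_field="text",
        packing=False,  # completion-only masking requires packing to be off
    ),
    processing_class=tokenizer,  # `tokenizer=tokenizer` on older TRL versions
)
trainer.train()
```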
If you’ve worked on similar scenarios or have pointers to relevant methods, libraries, or examples, I’d appreciate the guidance!