Hi all
According to the Hugging Face documentation, this is the format for SFT: human: ... bot: ...
However, according to the example dataset, the string in the text column should be in the following format:
### Human: ... ### Assistant: ...
When opening a new AutoTrain project, the following column mapping is requested: {"text": "text"}
So, my questions:
- Should it be
### Human: ... ### Assistant: ...
or
human: ... bot: ...
? (That is: human or Human, Assistant or bot, is it case-sensitive, and with or without ###?)
- I also saw a chaining of human-assistant pairs in the example dataset. In what case would I use this instead of just one pair?
- What do I do if I want to add context? Basically, I want to fine-tune a model to create a custom chatbot that is trained on conversations. Therefore, I need to input not only a single question and an answer, but also the whole conversation that preceded a specific question as context.
- Is there a way to provide a chain-of-thought input in the form of JSON? For example:
{"conversation_context": conversation_context, "customer_question": customer_question, "assistant_response": assistant_response}
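To make the last two questions concrete, here is a minimal sketch of what I imagine doing: flattening one JSON record into a single string for the text column by chaining the earlier turns in front of the final question and answer. The helper name to_text, the sample record, and the assumption that conversation_context is a list of [role, message] pairs are all my own. The ### Human: / ### Assistant: markers are copied from the example dataset, and whether those are the right ones is exactly what I am unsure about.

```python
import json

# Role markers copied from the example dataset. Whether AutoTrain expects
# exactly these strings is the open question in this post.
HUMAN = "### Human:"
ASSISTANT = "### Assistant:"


def to_text(record):
    """Flatten one JSON record into a single string for the "text" column.

    Assumes record["conversation_context"] is a list of [role, message]
    pairs, where role is either "human" or "assistant" (my own convention).
    """
    parts = []
    # Earlier turns come first, so the model sees them as context.
    for role, message in record["conversation_context"]:
        marker = HUMAN if role == "human" else ASSISTANT
        parts.append(f"{marker} {message}")
    # The final question/answer pair is appended as the last turn.
    parts.append(f"{HUMAN} {record['customer_question']}")
    parts.append(f"{ASSISTANT} {record['assistant_response']}")
    return " ".join(parts)


# Made-up sample record, for illustration only.
record = {
    "conversation_context": [
        ["human", "Hi, my order has not arrived."],
        ["assistant", "Sorry to hear that. What is the order number?"],
    ],
    "customer_question": "It is 12345.",
    "assistant_response": "Thanks, I will check the status now.",
}

# One row matching the {"text": "text"} column mapping.
row = {"text": to_text(record)}
print(json.dumps(row))
```

If this is a sensible approach, each training row would hold one whole conversation up to and including the answer I want the model to learn.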
Many thanks