I want to fine-tune an LLM on my own dataset, which focuses on a specific subject.
I have thousands of text files about this subject.
What dataset format is required for fine-tuning an LLM?
Is it just plain text files?
I have converted some of them to Q&A pairs in JSON format, like this:
{
  "Q": "Question",
  "A": "Answer",
  "ID": "QuestionID"
}
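
To make the question concrete, here is a minimal sketch of how records shaped like the one above could be stored and checked. It assumes a JSON Lines layout (one JSON object per line), which is a common convention for fine-tuning data but not something I know to be required by any particular framework. The helper names (write_jsonl, read_jsonl, is_valid) are hypothetical and only there for illustration.

```python
import json


def write_jsonl(records, path):
    """Write one JSON object per line (JSON Lines layout, assumed here)."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            # ensure_ascii=False keeps non-ASCII text readable in the file
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read the records back, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def is_valid(rec):
    """Check that Q, A and ID are all present as non-empty strings."""
    return all(
        isinstance(rec.get(key), str) and rec[key].strip()
        for key in ("Q", "A", "ID")
    )
```

Writing a list of such dicts with write_jsonl and reading it back with read_jsonl should return the same records, and is_valid can be used to filter out entries with a missing question, answer, or ID before training.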