I’ve gone through the Hugging Face material on training a model with a string. But is there any info or a tutorial on how to do it with multiple files?
Do I need to get the docs into one long string variable?
Do I need to split the files into single sentences instead of paragraphs?
Can I feed it one text file at a time and it continues to learn?
That’s a good question, and one that I can’t answer.
It depends on what kind of model you want, and many people are researching the best way to do it…
I think it would be quicker to ask on the HF Discord about where things currently stand: try one of the NLP-related channels, ask-for-help, or general.
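On the purely mechanical side of the multiple-files question, here is a rough sketch of one way to do it, not a recommendation. It reads every `.txt` file in a folder one at a time and splits each into fixed-size word chunks, so nothing has to be merged into one long string by hand. The function names (`iter_documents`, `chunk_words`, `build_examples`) and the 128-word chunk size are made up for illustration; real pipelines usually chunk by tokenizer tokens, and the right unit (sentence, paragraph, fixed window) depends on the model. As far as I know, the `datasets` library can also load a list of text files directly via `load_dataset("text", data_files=[...])`, which may make a helper like this unnecessary.

```python
from pathlib import Path
from typing import Iterator, List


def iter_documents(folder: str, pattern: str = "*.txt") -> Iterator[str]:
    """Yield the text of each matching file, one file at a time."""
    # sorted() keeps the file order deterministic across runs.
    for path in sorted(Path(folder).glob(pattern)):
        yield path.read_text(encoding="utf-8")


def chunk_words(text: str, max_words: int = 128) -> List[str]:
    """Split a document into consecutive chunks of at most max_words words."""
    # Assumption: whitespace-separated words stand in for tokens here.
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def build_examples(folder: str, max_words: int = 128) -> List[str]:
    """Collect the chunks from every file into one list of training examples."""
    examples: List[str] = []
    for doc in iter_documents(folder):
        examples.extend(chunk_words(doc, max_words))
    return examples
```

The resulting list of strings can then go wherever the single string went before (tokenizer, dataset builder, and so on). Whether a model "continues to learn" when fed one file at a time depends on the training setup, so I would still ask about that part on Discord.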