Hi there, I was wondering whether I should train my model in a line-by-line manner. What is the advantage of it?
Training with --line_by_line in train_mlm.py takes about 2x as long as training without it.
We need some more context on what you are trying to achieve and how you are implementing it.
I have continued pre-training a BERT base model for a new language without --line_by_line. As a statistical language model, it is able to predict the probability of a masked word, but I am just curious what the use case is for a model trained with --line_by_line. Is it so that the LLM can capture more context?
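To make the question concrete, here is a rough sketch of how I understand the two modes to differ. It assumes train_mlm.py behaves like the Hugging Face run_mlm.py example script, which may not be the case. Whitespace splitting stands in for a real tokenizer, and the function names and the `[PAD]` handling are made up for illustration.

```python
# Toy illustration only: whitespace "tokens" instead of a real subword tokenizer.
# Assumption: train_mlm.py mirrors the two preprocessing modes of run_mlm.py.

def line_by_line_examples(lines, max_len, pad="[PAD]"):
    """One training example per non-empty line, truncated or padded to max_len."""
    examples = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # empty lines produce no example
        tokens = tokens[:max_len]
        examples.append(tokens + [pad] * (max_len - len(tokens)))
    return examples


def grouped_examples(lines, max_len):
    """Concatenate all tokens, then cut into full blocks of max_len.

    The trailing remainder that does not fill a block is dropped.
    """
    stream = [tok for line in lines for tok in line.split()]
    usable = (len(stream) // max_len) * max_len
    return [stream[i:i + max_len] for i in range(0, usable, max_len)]


corpus = ["the cat sat", "", "dogs bark loudly at night", "hello"]
by_line = line_by_line_examples(corpus, max_len=4)
grouped = grouped_examples(corpus, max_len=4)

print(len(by_line), by_line)
print(len(grouped), grouped)
```

On this toy corpus the line-by-line mode gives three examples that never cross a line boundary (two of them padded), while the grouped mode gives two full blocks that mix tokens from neighbouring lines.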