The data I am training a custom tokenizer on is one long string. Can I pass it directly to a tokenizer model, and would it use the EOS token when it encounters a “.”, for example, so that it understands the sentence transition? My question is: I do not need to split it into lines, correct?
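
To make the question concrete, the manual alternative would look roughly like the sketch below: split the long string at each period and append an EOS marker to every sentence before training. This is only an illustration, assuming sentences end with a "." followed by whitespace. The helper `iter_sentences` and the `</s>` marker are hypothetical names, not part of any tokenizer library, and the actual EOS token depends on the tokenizer being trained.

```python
import re
from typing import Iterator

EOS = "</s>"  # hypothetical EOS marker; the real token depends on the tokenizer


def iter_sentences(text: str, eos: str = EOS) -> Iterator[str]:
    """Yield one sentence at a time, each followed by the EOS marker."""
    # Split on whitespace that follows a period; the period stays with its sentence.
    for sentence in re.split(r"(?<=\.)\s+", text.strip()):
        if sentence:  # skip the empty piece produced by empty input
            yield f"{sentence} {eos}"


corpus = "The cat sat. The dog barked. Then it rained."
sentences = list(iter_sentences(corpus))
for line in sentences:
    print(line)
```

A generator like this could be handed to a trainer that accepts an iterator of strings, so the corpus would not have to be written out line by line first. Note that a plain period split also breaks on abbreviations and decimals such as "e.g." or "3.14".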