When using MarkupLMTokenizerFast there is an argument, return_overflowing_tokens. This makes sense, since the model can only handle 512 tokens at a time: for example, if my data has 1024 tokens, the tokenizer returns a tensor of size (2, 512).
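To illustrate what I mean, here is a rough sketch of the chunking behavior I'm describing (this is not the Hugging Face implementation, just my understanding of what return_overflowing_tokens=True does conceptually):

```python
# Sketch only: a sequence longer than max_length is split into
# max_length-sized chunks, so 1024 tokens with max_length=512
# yields 2 rows, i.e. a (2, 512) batch.

def chunk_tokens(token_ids, max_length=512):
    """Split token_ids into consecutive chunks of at most max_length."""
    return [token_ids[i:i + max_length]
            for i in range(0, len(token_ids), max_length)]

tokens = list(range(1024))          # stand-in for 1024 token ids
chunks = chunk_tokens(tokens)
print(len(chunks), len(chunks[0]))  # 2 512
```

The second row here is what I'm calling the "overflowing" tokens.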
My question is: does the model consider the overflowing tokens during training? If so, can you point me to where it actually trains on that data? It's not clear to me that this is happening, and I want to make sure either way.