Let's think about BERT pair classification

BERT's positional encoding supports at most 512 tokens.
Suppose each item in my data is 512 tokens long, and I want to pair items up for a similarity check.

The BERT paper says the input for a pair is [CLS] first sequence [SEP] second sequence [SEP].
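
To make that layout concrete, here is a minimal sketch in plain Python. It assumes the inputs are already tokenized into lists of strings, and that the pair ends with a trailing [SEP] as in the BERT paper. The function name build_pair_input is made up for illustration and is not part of any library.

```python
def build_pair_input(tokens_a, tokens_b):
    """Lay out two token lists as one BERT-style pair input (illustrative helper)."""
    # [CLS] first sequence [SEP] second sequence [SEP]
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment ids: 0 for [CLS], the first sequence and its [SEP]; 1 for the rest.
    segment_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_pair_input(["how", "are", "you"], ["fine", "thanks"])
print(tokens)
print(segment_ids)
```

Note that the three special tokens take up positions too, so they count toward the 512 limit.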

That means the first and second sequences combined, plus the special tokens, must not exceed 512 tokens.

So each sequence could only be about 256 tokens long (slightly fewer once the special tokens are counted).
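
As a rough check of this arithmetic, here is a small sketch in plain Python. It assumes three special tokens per pair (one [CLS] and two [SEP]) and a 512-position limit. The names per_sequence_budget and pair_fits are hypothetical helpers, not library functions.

```python
MAX_POSITIONS = 512      # BERT's positional limit, as stated above
NUM_SPECIAL_TOKENS = 3   # assumption: [CLS] + [SEP] + [SEP] for a pair


def per_sequence_budget(max_positions=MAX_POSITIONS, num_special=NUM_SPECIAL_TOKENS):
    """Split the positions left after special tokens as evenly as possible."""
    content = max_positions - num_special
    first = content // 2
    second = content - first
    return first, second


def pair_fits(len_a, len_b, max_positions=MAX_POSITIONS, num_special=NUM_SPECIAL_TOKENS):
    """Return True if two sequences of these token lengths fit in one pair input."""
    return len_a + len_b + num_special <= max_positions


print(per_sequence_budget())   # budget for each of the two sequences
print(pair_fits(512, 512))     # two full-length items in one input
print(pair_fits(254, 255))     # two items trimmed to the budget
```

Under these assumptions an even split leaves 254 and 255 tokens for the two sequences, so two 512-token items cannot share one input without being shortened.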

My question is: if each item in my data is 512 tokens long and I want to build a pair similarity classifier, what should I do?