Dialogue classification

I want to classify a dialogue between a client and an assistant. What is the best way to encode the chat so that the model knows which speaker said which part? Should I use custom special tokens?
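
To make the special-token idea concrete, here is a minimal sketch of the encoding I have in mind. The marker strings `[CLIENT]` and `[ASSISTANT]` and the helper `encode_dialogue` are names I made up for illustration; they do not come from any library, and this only builds the flattened text, not the tokenization itself.

```python
# Hypothetical speaker markers; the names are illustrative, not from any library.
CLIENT_TOKEN = "[CLIENT]"
ASSISTANT_TOKEN = "[ASSISTANT]"

SPEAKER_TOKENS = {"client": CLIENT_TOKEN, "assistant": ASSISTANT_TOKEN}


def encode_dialogue(turns):
    """Flatten (speaker, text) turns into one string, with a marker before each turn."""
    parts = []
    for speaker, text in turns:
        marker = SPEAKER_TOKENS[speaker]  # raises KeyError for an unknown speaker
        parts.append(f"{marker} {text.strip()}")
    return " ".join(parts)


dialogue = [
    ("client", "Hi, my order has not arrived."),
    ("assistant", "Sorry to hear that. Can you share the order number?"),
]
encoded = encode_dialogue(dialogue)
print(encoded)
```

If the classifier is a Hugging Face `transformers` model (an assumption on my part), the markers would presumably have to be registered with something like `tokenizer.add_special_tokens({"additional_special_tokens": [...]})` followed by `model.resize_token_embeddings(len(tokenizer))`, so that each marker maps to a single token with its own trainable embedding.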