Combine multiple embeddings from different authors

I have the following text regression problem: I have a series of comments from social network users, and for each user I have a continuous toxicity value that I want to predict.

How can I exploit the comments in order to train a transformer that gives a full representation of the user?

I could assume that all the comments from a single user share the same hate speech value: for example, every comment by user “John Doe” would get a hate speech value of 50, since 50 is the toxicity value associated with that user.
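
To make this baseline concrete, here is a minimal sketch in plain Python. The user names, comment texts, scores and the `aggregate_by_user` helper are all made up for illustration, and averaging the comment-level predictions to get back a user-level value is my own assumption about how such a baseline would be evaluated.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: one toxicity score per user, several comments per user.
user_scores = {"john_doe": 50.0, "jane_roe": 10.0}
comments = [
    ("john_doe", "first comment"),
    ("john_doe", "second comment"),
    ("jane_roe", "another comment"),
]

# Baseline labelling: every comment inherits the score of its author.
train_pairs = [(text, user_scores[user]) for user, text in comments]


def aggregate_by_user(user_ids, comment_preds):
    """Average comment-level predictions to obtain one value per user."""
    grouped = defaultdict(list)
    for user, pred in zip(user_ids, comment_preds):
        grouped[user].append(pred)
    return {user: mean(preds) for user, preds in grouped.items()}


# Hypothetical outputs of a comment-level regressor, in the same order as `comments`.
comment_preds = [40.0, 60.0, 12.0]
user_preds = aggregate_by_user([user for user, _ in comments], comment_preds)
```

Here `train_pairs` is what a standard single-text regression model would be fine-tuned on, and `user_preds` is the per-user value recovered afterwards.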

But that would be a simplification. I was thinking about a trainable transformer architecture that, for each user, takes into account all the comments by that user and then merges them (maybe using a CNN or something similar) into a single embedding representation of the user, to be used for the regression task.
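
A rough PyTorch sketch of what I have in mind is below. It is only an illustration under several assumptions: random tensors stand in for the per-comment embeddings that a transformer encoder would produce, attention pooling is used in place of the CNN (just one possible merge step), every user is assumed to have at least one comment, and the class name `UserRegressor` and all dimensions are made up.

```python
import torch
import torch.nn as nn


class UserRegressor(nn.Module):
    """Pools a user's comment embeddings into one vector and regresses a score."""

    def __init__(self, comment_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        # Produces one relevance score per comment; softmax turns scores into weights.
        self.attn = nn.Sequential(
            nn.Linear(comment_dim, hidden_dim), nn.Tanh(), nn.Linear(hidden_dim, 1)
        )
        # Maps the pooled user embedding to a single continuous value.
        self.head = nn.Sequential(
            nn.Linear(comment_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, 1)
        )

    def forward(self, comments, mask):
        # comments: (users, max_comments, comment_dim)
        # mask:     (users, max_comments), True where a real comment is present
        scores = self.attn(comments).squeeze(-1)
        scores = scores.masked_fill(~mask, float("-inf"))  # ignore padding
        weights = torch.softmax(scores, dim=-1)
        user_emb = (weights.unsqueeze(-1) * comments).sum(dim=1)
        return self.head(user_emb).squeeze(-1), weights


torch.manual_seed(0)
# Stand-in for encoder outputs: 3 users, up to 4 comments each, 32-dim embeddings.
comments = torch.randn(3, 4, 32)
mask = torch.tensor(
    [[True, True, True, True],
     [True, True, False, False],
     [True, False, False, False]]
)
targets = torch.tensor([50.0, 10.0, 80.0])  # one toxicity value per user

model = UserRegressor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One training step with a mean squared error loss on the user-level targets.
preds, weights = model(comments, mask)
loss = nn.functional.mse_loss(preds, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In a full setup the random `comments` tensor would be replaced by the output of a transformer encoder applied to each comment, so that the loss on the user-level target also updates the encoder.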

Is there already an architecture that does this thing?

Thanks in advance