Why is there no pooler representation for XLNet or a consistent use of sequence_summary()?

I’m trying to create sentence embeddings using different Transformer models. I’ve created my own class where I pass in a Transformer model, and I want to call the model to get a sentence embedding.
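Roughly, the class looks like this (a simplified sketch, not my exact code; I'm using AutoModel/AutoTokenizer here just for illustration and assuming a transformers version where the tokenizer is callable):

    import torch
    from transformers import AutoModel, AutoTokenizer

    class SentenceEmbedder:
        """Wraps a Transformer model and returns one vector per input sentence."""

        def __init__(self, model_name):
            self.tokenizer = AutoTokenizer.from_pretrained(model_name)
            self.model = AutoModel.from_pretrained(model_name)

        def embed(self, sentences):
            inputs = self.tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
            with torch.no_grad():
                outputs = self.model(**inputs)
            # this is the part I want to be model-independent:
            # for BERT/RoBERTa I can use the pooler output, but XLNetModel has no pooler_output
            return outputs.pooler_output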

Both BertModel and RobertaModel return a pooler_output (which I use as the sentence embedding).

pooler_output (torch.FloatTensor of shape (batch_size, hidden_size)) – Last layer hidden-state of the first token of the sequence (classification token) further processed by a Linear layer and a Tanh activation function. The Linear layer weights are trained from the next sentence prediction (classification) objective during pretraining.
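
For example, with BERT I can get the embedding like this (assuming a transformers version where the model returns a ModelOutput; in older versions the pooler output is outputs[1]):

    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("This is a sentence.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    sentence_embedding = outputs.pooler_output  # shape (1, hidden_size); outputs[1] in older versions
    print(sentence_embedding.shape)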

Why does XLNetModel not produce a similar pooler_output?

When I look at the source code for XLNetForSequenceClassification, I see that there is code for producing a sentence-level representation through a call to self.sequence_summary() (an instance of SequenceSummary):

    # abridged from XLNetForSequenceClassification (full signature omitted)
    def forward(self, ...):
        transformer_outputs = self.transformer( ... )
        output = transformer_outputs[0]  # last hidden states, shape (batch_size, seq_len, hidden_size)

        output = self.sequence_summary(output)  # pooled to shape (batch_size, hidden_size)
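
As a workaround I've been thinking about calling SequenceSummary myself on top of XLNetModel, along these lines (a sketch; the import path transformers.modeling_utils.SequenceSummary is what I see in the version I'm using):

    import torch
    from transformers import XLNetModel, XLNetTokenizer
    from transformers.modeling_utils import SequenceSummary

    tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
    model = XLNetModel.from_pretrained("xlnet-base-cased")

    # SequenceSummary reads config.summary_type etc.; for XLNet it pools the last token.
    # Note: its projection layer is freshly initialized here, not pretrained.
    sequence_summary = SequenceSummary(model.config)

    inputs = tokenizer("This is a sentence.", return_tensors="pt")
    with torch.no_grad():
        hidden_states = model(**inputs)[0]                     # (batch_size, seq_len, hidden_size)
        sentence_embedding = sequence_summary(hidden_states)   # (batch_size, hidden_size)

I'm not sure this is really equivalent to BERT's pooler_output, though, since the projection weights are not pretrained.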

Why is this sequence_summary() function not used consistently in the other Transformers models, such as BertForSequenceClassification and RobertaForSequenceClassification?