Correct way to get pooled output of LXMertForPretraining

ihaelm · August 10, 2022, 6:57pm

I want to get the pooled output from the LXMertForPretraining class.
The keys provided in the LXMertForPretrainingOutput are:

odict_keys(['prediction_logits', 'cross_relationship_score', 'question_answering_score', 'language_hidden_states', 'vision_hidden_states', 'language_attentions', 'vision_attentions', 'cross_encoder_attentions'])

None of those keys has the pooled output, so I am getting the pooled output this way:

visual_output = output['vision_hidden_states'][-1]
lang_output = output['language_hidden_states'][-1]
pooled_output = model_lxmert.lxmert.lxmert.pooler(lang_output)

This is consistent with what happens in LXMertModel here:
https://github.com/huggingface/transformers/blob/main/src/transformers/models/lxmert/modeling_lxmert.py#L1004, which is basically this:

hidden_states = (language_hidden_states, vision_hidden_states) if output_hidden_states else ()
visual_output = vision_hidden_states[-1]
lang_output = language_hidden_states[-1]
pooled_output = self.pooler(lang_output)

But since I am using the LXMertForPretraining class instead of the LXMert base class, I effectively need two passes: one forward pass to get the output of the LXMertForPretraining model, and then a second pass through the pooler to get the pooled output.
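
To sanity-check this approach, here is a minimal sketch. It assumes the class names as spelled in transformers (LxmertConfig, LxmertForPreTraining) and uses a tiny, randomly initialised config with made-up sizes instead of a real checkpoint. A forward hook records what the internal pooler returns during the normal forward pass, so it can be compared with the value recomputed from language_hidden_states[-1]. The hook is only an illustration, not something the library documents for this purpose.

```python
import torch
from transformers import LxmertConfig, LxmertForPreTraining

# Tiny random model; all sizes are arbitrary assumptions chosen to run fast.
config = LxmertConfig(
    vocab_size=100, hidden_size=32, num_attention_heads=4, intermediate_size=64,
    l_layers=1, x_layers=1, r_layers=1, visual_feat_dim=16,
    num_qa_labels=5, num_object_labels=7, num_attr_labels=3,
)
model = LxmertForPreTraining(config).eval()

# Record the pooler output produced inside the normal forward pass.
captured = {}
hook = model.lxmert.pooler.register_forward_hook(
    lambda module, inputs, output: captured.update(pooled=output.detach())
)

# Dummy inputs: 2 examples, 6 tokens, 3 visual boxes each.
input_ids = torch.randint(0, config.vocab_size, (2, 6))
visual_feats = torch.randn(2, 3, config.visual_feat_dim)
visual_pos = torch.rand(2, 3, config.visual_pos_dim)

with torch.no_grad():
    output = model(
        input_ids=input_ids,
        visual_feats=visual_feats,
        visual_pos=visual_pos,
        output_hidden_states=True,
        return_dict=True,
    )
hook.remove()  # remove before the manual call so `captured` is not overwritten

# The approach from the post: pool the last language hidden state again.
with torch.no_grad():
    lang_output = output.language_hidden_states[-1]
    pooled_output = model.lxmert.pooler(lang_output)

print(pooled_output.shape)
print(torch.allclose(pooled_output, captured["pooled"]))
```

In this sketch the second step only runs the pooler (a single dense layer plus tanh on the first token), not the whole model again, so the extra cost should be small.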

Is this the correct way to get the pooled output from LXMertForPretraining, or is there a better way to do it?
Thanks
