Clarification for the forward function of the SequenceSummary class from modeling_utils.py

h56cho · October 16, 2020, 2:53pm

Hello,
I have a question about the docstring of the forward function of the SequenceSummary class in modeling_utils.py:

https://github.com/huggingface/transformers/blob/dc552b9b7025ea9c38717f30ad3d69c2a972049d/src/transformers/modeling_utils.py#L1484


    self.activation: Callable = get_activation(activation_string) if activation_string else Identity()

    self.first_dropout = Identity()
    if hasattr(config, "summary_first_dropout") and config.summary_first_dropout > 0:
        self.first_dropout = nn.Dropout(config.summary_first_dropout)

    self.last_dropout = Identity()
    if hasattr(config, "summary_last_dropout") and config.summary_last_dropout > 0:
        self.last_dropout = nn.Dropout(config.summary_last_dropout)

def forward(
    self, hidden_states: torch.FloatTensor, cls_index: Optional[torch.LongTensor] = None
) -> torch.FloatTensor:
    """
    Compute a single vector summary of a sequence hidden states.

    Args:
        hidden_states (:obj:`torch.FloatTensor` of shape :obj:`[batch_size, seq_len, hidden_size]`):
            The hidden states of the last layer.
        cls_index (:obj:`torch.LongTensor` of shape :obj:`[batch_size]` or :obj:`[batch_size, ...]` where ... are optional leading dimensions of :obj:`hidden_states`, `optional`):
            Used if :obj:`summary_type == "cls_index"` and takes the last token of the sequence as classification

So when cls_index is not passed to the forward function of SequenceSummary (and summary_type == "cls_index"), is the last token of the sequence used for the classification task? The sentence describing cls_index in the docstring is somewhat awkward…
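
To make the question concrete, here is a minimal sketch of the behavior I think the docstring is describing. It is not the library's code: `summarize_cls_index` is a hypothetical helper, it only handles the plain `[batch_size, seq_len, hidden_size]` case, and the fallback to the last token when `cls_index` is omitted is exactly the assumption I am asking about.

```python
from typing import Optional

import torch


def summarize_cls_index(
    hidden_states: torch.Tensor, cls_index: Optional[torch.Tensor] = None
) -> torch.Tensor:
    """Hypothetical helper: select one token vector per sequence."""
    batch_size, seq_len, hidden_size = hidden_states.shape
    if cls_index is None:
        # Assumed fallback: use the last position of every sequence.
        cls_index = torch.full(
            (batch_size,), seq_len - 1, dtype=torch.long, device=hidden_states.device
        )
    # Expand the indices to (batch_size, 1, hidden_size) so gather selects along dim 1.
    index = cls_index.view(batch_size, 1, 1).expand(-1, 1, hidden_size)
    # Result has shape (batch_size, hidden_size).
    return hidden_states.gather(1, index).squeeze(1)


hidden = torch.arange(2 * 3 * 4, dtype=torch.float).reshape(2, 3, 4)
default_summary = summarize_cls_index(hidden)  # no cls_index given
explicit_summary = summarize_cls_index(hidden, torch.tensor([0, 1]))
```

Under this reading, `default_summary` would equal `hidden[:, -1, :]`, while `explicit_summary` picks position 0 from the first sequence and position 1 from the second.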

Thanks,
