Hello,
I have been reading the documentation of the BEiT model here. In the section on `pooler_output`, this is what is written:
> Last layer hidden-state of the first token of the sequence (classification token) after further processing through the layers used for the auxiliary pretraining task. E.g. for the BERT-family of models, this returns the classification token after processing through a linear layer and a tanh activation function. The linear layer weights are trained from the next sentence prediction (classification) objective during pretraining
After going through the code [here](https://github.com/huggingface/transformers/blob/master/src/transformers/models/beit/modeling_beit.py#L666-L667), the pooler output is actually the mean of the hidden states, not a linear projection of the CLS token.
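To illustrate the mismatch, here is a minimal sketch (in plain NumPy, not the actual `transformers` code) of the two behaviours: mean pooling over the patch tokens, as the linked BEiT code does, versus the BERT-style linear-plus-tanh pooler on the CLS token that the docs describe. The weight and bias below are random placeholders, not pretrained parameters.

```python
import numpy as np

def mean_pool(hidden_states):
    # BEiT-style: average the patch tokens (everything after the CLS token)
    return hidden_states[:, 1:, :].mean(axis=1)

def cls_pool(hidden_states, weight, bias):
    # BERT-style pooler described in the docs: linear layer + tanh on the CLS token
    return np.tanh(hidden_states[:, 0, :] @ weight + bias)

rng = np.random.default_rng(0)
hidden = rng.standard_normal((2, 197, 768))  # (batch, 1 CLS + 196 patches, hidden)
w = rng.standard_normal((768, 768)) * 0.02   # placeholder pooler weights
b = np.zeros(768)

print(mean_pool(hidden).shape)        # (2, 768)
print(cls_pool(hidden, w, b).shape)   # (2, 768)
```

Both return the same shape, so the difference is easy to miss unless you read the modeling code.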
Would it be possible to update the documentation? The current wording is confusing when reading through it.
Note: I thought of raising the issue in the GitHub repo but couldn't find how to do so in the case of documentation.