Output embedding from each self-attention head from each encoder layer

Sreyan · February 28, 2022, 7:41am

Hi there!

For one of my projects, I need the embeddings from each self-attention head of each encoder layer. Is this possible with the Hugging Face library?

If not, can I just slice the original embeddings from each layer (for example, 768/12 = 64-dimensional slices) to get each attention head's output?
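For reference, here is a minimal sketch of one way this might be done, assuming a BERT-style model from `transformers`. Slicing a layer's final hidden state is unlikely to give per-head outputs, because the attention output projection (and the feed-forward block after it) mixes the heads together. The sketch instead registers a PyTorch forward hook on each layer's `attention.self` module and assumes the first element of that module's return value is the concatenated head outputs, shape `(batch, seq_len, hidden_size)`, before the output projection. The small randomly initialised config and the names `make_hook` and `per_layer_heads` are illustrative only. For real use you would load a pretrained checkpoint instead.

```python
import torch
from transformers import BertConfig, BertModel

# Randomly initialised 2-layer BERT so the sketch runs without a download.
# Assumption: for real use, swap in BertModel.from_pretrained(<checkpoint>).
config = BertConfig(
    hidden_size=768,
    num_hidden_layers=2,
    num_attention_heads=12,
    intermediate_size=3072,
)
model = BertModel(config).eval()

num_heads = config.num_attention_heads
head_dim = config.hidden_size // num_heads  # 768 // 12 = 64

# layer index -> tensor of shape (batch, num_heads, seq_len, head_dim)
per_layer_heads = {}


def make_hook(layer_idx):
    """Build a forward hook that stores per-head outputs for one layer."""

    def hook(module, inputs, output):
        # Assumption: the self-attention module returns a tuple whose first
        # element is the concatenated head outputs (before the dense projection).
        context = output[0] if isinstance(output, tuple) else output
        batch, seq_len, _ = context.shape
        heads = context.reshape(batch, seq_len, num_heads, head_dim)
        per_layer_heads[layer_idx] = heads.permute(0, 2, 1, 3).detach()

    return hook


handles = [
    layer.attention.self.register_forward_hook(make_hook(i))
    for i, layer in enumerate(model.encoder.layer)
]

# Dummy batch: one sequence of 8 random token ids.
input_ids = torch.randint(0, config.vocab_size, (1, 8))
with torch.no_grad():
    model(input_ids=input_ids)

# Remove the hooks once the outputs have been captured.
for handle in handles:
    handle.remove()

for layer_idx, heads in per_layer_heads.items():
    print(layer_idx, tuple(heads.shape))
```

Here `per_layer_heads[l][:, h]` would be the output of head `h` in layer `l` for every token. Other architectures name their attention submodules differently, so the hook target would need adjusting.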

Thank You
