Hugging Face Forums
In Donut, where is the Swin encoder output fused with the text? 1. At the start of the BART encoder; 2. in the cross-attention (K, V from Swin, Q from self-attention) of the second attention block of the BART encoder; 3. directly in the decoder part of BART?
🤗Transformers
shubham05
August 2, 2023, 8:28am
1
Is it the same architecture as follows?
[Image: hand-drawn architecture diagram, 1280×654]
Is it trained and tested in the same manner as follows?
[Image: training/inference screenshot, 903×492]
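All three candidate placements describe the same underlying mechanism: cross-attention, where the queries are projected from the text-side hidden states and the keys and values from the Swin patch features. A minimal single-head NumPy sketch of that fusion step (the shapes, random projection matrices, and dimensions below are illustrative assumptions, not Donut's actual sizes):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(text_states, image_states, d_k=64, seed=0):
    """Single-head cross-attention: Q comes from the text (decoder)
    states, K and V from the Swin image features."""
    rng = np.random.default_rng(seed)
    d_text = text_states.shape[-1]
    d_img = image_states.shape[-1]
    # random projection weights stand in for learned parameters
    W_q = rng.standard_normal((d_text, d_k)) / np.sqrt(d_text)
    W_k = rng.standard_normal((d_img, d_k)) / np.sqrt(d_img)
    W_v = rng.standard_normal((d_img, d_k)) / np.sqrt(d_img)
    Q = text_states @ W_q           # (T_text, d_k)
    K = image_states @ W_k          # (T_img, d_k)
    V = image_states @ W_v          # (T_img, d_k)
    scores = Q @ K.T / np.sqrt(d_k) # (T_text, T_img)
    # each text token attends over all image patches
    return softmax(scores) @ V      # (T_text, d_k)

# toy example: 5 text tokens attending over 49 image patches
text = np.random.default_rng(1).standard_normal((5, 32))
image = np.random.default_rng(2).standard_normal((49, 128))
out = cross_attention(text, image)
print(out.shape)  # (5, 64)
```

Whichever placement is correct, the text sequence length is preserved and each text token becomes a weighted mixture of image-patch values, which is what "fusing" Swin output with the text means here.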