Hugging Face Forums
About parameter sharing in t5-v1.1
Models
xwwwww
March 12, 2022, 7:41am
I’m a bit confused about why t5-v1_1 disables parameter sharing. What is this change meant to achieve?