Hugging Face Forums
Shortformer: Better Language Modeling using Shorter Inputs
Research
FL33TW00D
December 31, 2020, 10:02am
Interesting paper focusing on shorter context windows and improving training speed!