Shortformer: Better Language Modeling using Shorter Inputs

Interesting paper on using shorter context windows to improve language modeling and speed up training!
