I’m looking for ways to infer w/ a Transformer model in a continuous manner — basically, I want it to retain some information about the previous sample in case it was part of the same text segment.
One approach I’m trying out now is inference with overlapping windows (stride < window length), aggregating the encoder embeddings over the overlapping part of the sequence (i.e. using information from window N when inferring window N+1). I aggregate by summing rather than mean/dot product, since that gives the result closest to plain inference, but the output still doesn’t account for earlier context, so the approach doesn’t work.
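To make the setup concrete, here's a minimal sketch of what I mean by summed overlapping windows. A toy embedding table stands in for the real encoder (the table, dimensions, and the `sliding_infer` helper are all made up for illustration; a real encoder would produce context-dependent embeddings, so the overlap regions of consecutive windows would not be identical):

```python
import numpy as np

D, VOCAB = 4, 100
# Toy stand-in for a Transformer encoder: a fixed per-token embedding
# table (hypothetical; a real encoder's outputs depend on the window).
emb_table = np.random.default_rng(0).normal(size=(VOCAB, D))

def encode(window):
    # Look up "embeddings" for each token id in the window.
    return emb_table[window]

def sliding_infer(tokens, length=8, stride=4):
    """Slide a window of `length` tokens forward by `stride` tokens
    and SUM the embeddings where consecutive windows overlap."""
    tokens = np.asarray(tokens)
    out = np.zeros((len(tokens), D))
    for start in range(0, len(tokens), stride):
        window = tokens[start:start + length]
        out[start:start + len(window)] += encode(window)
        if start + length >= len(tokens):
            break
    return out

tokens = list(range(12))
h = sliding_infer(tokens)  # windows cover [0:8] and [4:12]
# positions 4..7 lie in both windows, so their vectors are summed twice
```

With a context-free encoder like this, summing just doubles the overlap region; the hope with a real encoder is that window N+1's representation of the overlap carries information from window N forward, which is exactly what doesn't seem to happen in my experiments.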
Has this problem been addressed already? Is the typical solution just to increase the maximum input length? (What if I don’t have enough compute to train a model with long inputs?)