Caching encoder state for multiple encoder-decoder `.generate()` calls?

I’m using a VisionEncoderDecoderModel and I want to decode from the same encoded image multiple times (say, 30+ times per image) without rerunning the encoder on every `model.generate()` call. Is there a way to cache the encoder state and reuse it? Or is there another efficient way to decode multiple times from the encoded input?

Hi! You can pass `generate` an argument called `encoder_outputs`, which the decoder will then use instead of re-running the encoder on every call. Optionally, you can also pass `decoder_input_ids`; otherwise they are initialized from the BOS token.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tok = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

inputs_encoder = tok("Hello, my dog is cute", return_tensors="pt")

# Forced decoder prefix; add_special_tokens=False keeps the tokenizer from
# appending </s>, which would otherwise push the decoder to stop immediately.
decoder_input_ids = tok("Bonjour", return_tensors="pt", add_special_tokens=False)["input_ids"]

# Run the encoder once; the resulting ModelOutput can be reused across calls.
encoder_outputs = model.get_encoder()(**inputs_encoder)

out = model.generate(decoder_input_ids=decoder_input_ids, encoder_outputs=encoder_outputs, num_beams=1, do_sample=False)
print(tok.batch_decode(out))
```
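
For the VisionEncoderDecoderModel case in the question, the same pattern applies: run the vision encoder once on the image, then call `generate` from the cached features as many times as you want. Here is a minimal sketch, assuming a recent transformers version and the public nlpconnect/vit-gpt2-image-captioning checkpoint; the image path and sampling settings are placeholders:

```python
import torch
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

ckpt = "nlpconnect/vit-gpt2-image-captioning"  # assumed checkpoint, for illustration
model = VisionEncoderDecoderModel.from_pretrained(ckpt)
processor = ViTImageProcessor.from_pretrained(ckpt)
tok = AutoTokenizer.from_pretrained(ckpt)

image = Image.open("photo.jpg").convert("RGB")  # placeholder path
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# Encode the image exactly once.
with torch.no_grad():
    encoder_outputs = model.get_encoder()(pixel_values=pixel_values)

# Decode many times (e.g. 30 sampled captions) from the cached features.
captions = []
for _ in range(30):
    out = model.generate(encoder_outputs=encoder_outputs, do_sample=True, top_k=50,
                         max_new_tokens=32, pad_token_id=tok.eos_token_id)
    captions.append(tok.decode(out[0], skip_special_tokens=True))
print(captions)
```

One caveat: with `num_beams > 1`, some transformers versions expand `encoder_outputs` in place during beam search, so if you mix beam search with reuse, it’s safer to pass a fresh copy of the encoder output to each call.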
