When to use AutoModelForSeq2SeqLM?

sinduhrsh · July 21, 2022, 6:15am

Can anyone tell me when to use AutoModelForSeq2SeqLM? Is it generally used for translation? Can I use AutoModelForSeq2SeqLM to fine-tune a T5 model on a custom task? If not, when should I use AutoModelForSeq2SeqLM and when T5ForConditionalGeneration?

When should I use each of these?

AutoModelForSeq2SeqLM.from_pretrained('t5-base')
T5ForConditionalGeneration.from_pretrained('t5-base')

nielsr · July 21, 2022, 11:21am

Hi,

AutoModelForSeq2SeqLM can be used to load any seq2seq (or encoder-decoder) model that has a language modeling (LM) head on top. These include BART, PEGASUS, T5, etc. You can check the full list of supported models in the docs: Auto Classes

So when you do AutoModelForSeq2SeqLM.from_pretrained('t5-base'), it will actually load a T5ForConditionalGeneration for you behind the scenes.
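
A minimal sketch of that dispatch, assuming transformers and PyTorch are installed. To avoid downloading weights it builds a tiny, randomly initialised T5 config (the sizes are arbitrary and chosen only for speed) and uses from_config instead of from_pretrained; the class selection works the same way in both cases.

```python
from transformers import AutoModelForSeq2SeqLM, T5Config, T5ForConditionalGeneration

# Tiny illustrative config; these sizes are assumptions, not those of t5-base.
config = T5Config(
    vocab_size=128,
    d_model=32,
    d_kv=8,
    d_ff=64,
    num_layers=1,
    num_heads=2,
)

# The Auto class looks at the config type and picks the matching model class.
model = AutoModelForSeq2SeqLM.from_config(config)

class_name = type(model).__name__
is_t5 = isinstance(model, T5ForConditionalGeneration)
print(class_name, is_t5)
```

With a real checkpoint, the same check can be done on the object returned by AutoModelForSeq2SeqLM.from_pretrained('t5-base').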

bdzyubak · June 5, 2024, 9:30pm

Can you recommend a model to extract specific values from a large set of text? I am trying to do the SROIE dataset task2 - convert OCR-extracted text from receipts into total amount spent and company where the purchase was made. I am able to get ChatGPT-4 to extract the info quite nicely, but with t5-base, so far, I just get:
'<pad> True</s>'

Data is bounding box coords + extracted text.

190,864,309,864,309,880,190,880,EXCHANGEABLE
142,883,353,883,353,901,142,901,***
137,903,351,903,351,920,137,920,***
202,942,292,942,292,959,202,959,THANK YOU
163,962,330,962,330,977,163,977,PLEASE COME AGAIN !
412,639,442,639,442,654,412,654,9.00

The prompt I tried is:

input_text = f"Context: {the_data_above}\n\nQuestion: What is the total amount spent?\n\nAnswer:"
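
For readers who want to reproduce this, here is a rough sketch of how rows in the format above could be parsed and turned into such a prompt. It assumes each row is eight integer coordinates followed by the text, and that sorting by the top-left corner approximates reading order. The names OcrRow, parse_row and build_prompt are hypothetical helpers, and dropping the coordinates from the context is one possible variant, not what was actually run in this post.

```python
from typing import List, NamedTuple


class OcrRow(NamedTuple):
    coords: List[int]  # x1,y1,x2,y2,x3,y3,x4,y4 of the bounding box
    text: str


def parse_row(line: str) -> OcrRow:
    # Split at most 8 times so commas inside the text are preserved.
    parts = line.strip().split(",", 8)
    return OcrRow([int(p) for p in parts[:8]], parts[8])


def build_prompt(lines: List[str], question: str) -> str:
    rows = [parse_row(line) for line in lines if line.strip()]
    # Assumption: top-to-bottom, then left-to-right, using the first corner.
    rows.sort(key=lambda r: (r.coords[1], r.coords[0]))
    context = "\n".join(r.text for r in rows)
    return f"Context: {context}\n\nQuestion: {question}\n\nAnswer:"


sample = [
    "202,942,292,942,292,959,202,959,THANK YOU",
    "412,639,442,639,442,654,412,654,9.00",
    "163,962,330,962,330,977,163,977,PLEASE COME AGAIN !",
]
input_text = build_prompt(sample, "What is the total amount spent?")
print(input_text)
```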

Is T5 a good model to use, or is there a better one? Are any special tricks needed to get it to use bounding box coordinates to infer text relationships?
Many thanks!

nielsr · June 10, 2024, 9:15am

Hi,

T5 is a model meant to be fine-tuned; it is very limited in the zero-shot setting.

As SROIE is a document AI task, I recommend taking a look at the models we offer for that: Accelerating Document AI.
