What happens in the MT5 documentation example?

Hi,
I’m trying to understand the example provided in the MT5 model documentation, but I’m having some difficulty.

Here is the example:
```python
from transformers import MT5Model, T5Tokenizer

model = MT5Model.from_pretrained("google/mt5-small")
tokenizer = T5Tokenizer.from_pretrained("google/mt5-small")
article = "UN Offizier sagt, dass weiter verhandelt werden muss in Syrien."
summary = "Weiter Verhandlung in Syrien."
batch = tokenizer.prepare_seq2seq_batch(
    src_texts=[article], tgt_texts=[summary], return_tensors="pt"
)
outputs = model(input_ids=batch.input_ids, decoder_input_ids=batch.labels)
hidden_states = outputs.last_hidden_state
```

So I understand that tokenizer.prepare_seq2seq_batch encodes the inputs to feed to the model. It returns a BatchEncoding containing input_ids, attention_mask, and labels.
However, I don’t understand what follows. What happens in model(input_ids=batch.input_ids, decoder_input_ids=batch.labels)? This call doesn’t train or fine-tune the model, so what does it do?
And why do we provide both a source and a target? What if we wanted the model to generate the target (summary) itself?
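For context, here is what I would have tried if I wanted the model to actually generate a summary (a sketch only; I’m assuming MT5ForConditionalGeneration, which adds the language-modeling head and supports generate, is the right class for this rather than the bare MT5Model):

```python
from transformers import MT5ForConditionalGeneration, T5Tokenizer

# MT5ForConditionalGeneration has the LM head needed to produce tokens;
# MT5Model alone only returns hidden states.
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = T5Tokenizer.from_pretrained("google/mt5-small")

article = "UN Offizier sagt, dass weiter verhandelt werden muss in Syrien."
input_ids = tokenizer(article, return_tensors="pt").input_ids

# Autoregressive decoding: no target is passed in, the decoder builds it token by token.
generated = model.generate(input_ids, max_length=20)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```

(Note that google/mt5-small is only pretrained, not fine-tuned on summarization, so I wouldn’t expect a meaningful summary without fine-tuning first.)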

Thanks !