The notebook creates examples in which the input and the labels contain the same text. What is the purpose of such a model? Is it training on some autoencoder task? I would have thought a more interesting challenge would be: given an input sample of text, have the label be the continuation of that text.
As mentioned in the notebooks, the task is causal language modeling, i.e. predicting the next word. They also say explicitly:
First note that we duplicate the inputs for our labels. This is because the model of the Transformers library applies the shift to the right, so we don’t need to do it manually.
That is why you see the same labels as the inputs.
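To make the shifting concrete, here is a minimal sketch in plain PyTorch of roughly what happens inside the model when `labels` is just a copy of `input_ids` (this mirrors the idea, not the library's exact code):

```python
import torch
import torch.nn.functional as F

# Toy batch: the labels are simply a copy of the input ids.
input_ids = torch.tensor([[5, 17, 42, 8]])   # (batch, seq_len)
labels = input_ids.clone()
logits = torch.randn(1, 4, 100)              # pretend model output, (batch, seq_len, vocab)

# The shift to the right: logits at position t are scored against
# the label at position t + 1, i.e. "predict the next token".
shift_logits = logits[:, :-1, :]
shift_labels = labels[:, 1:]

loss = F.cross_entropy(
    shift_logits.reshape(-1, shift_logits.size(-1)),
    shift_labels.reshape(-1),
)
print(loss)
```

So even though the inputs and labels are identical tensors, the objective the model actually optimizes is next-token prediction, which is exactly the "predict the continuation" task you describe.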
I am not sure what you mean by “switch the attention set”. If you mean the attention mask: yes, the model applies a causal attention mask to hide future tokens (otherwise you would see a perplexity close to 1 at the end of training).
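For completeness, the hiding of future tokens is done with a causal (lower-triangular) mask on the attention scores, along these lines (a simplified sketch, not the library's actual implementation):

```python
import torch

seq_len = 4
scores = torch.randn(seq_len, seq_len)   # raw attention scores for one head

# Position t is only allowed to attend to positions <= t.
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
scores = scores.masked_fill(~causal_mask, float("-inf"))

attn = scores.softmax(dim=-1)            # future positions end up with weight 0
print(attn)
```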