Supervised Fine-tuning Trainer - where is the 'supervised' part?

Hi,

I have recently been reviewing the Supervised Fine-tuning Trainer page, and in the Quickstart section they mention (supervised) fine-tuning on the imdb dataset (the 'text' field), which contains movie reviews. As the model they use AutoModelForCausalLM.from_pretrained("facebook/opt-350m"). In this case, what exactly does it mean to fine-tune this model in a supervised fashion? As far as I know, causal LM modeling is based on next-token prediction, so when fine-tuning on the imdb dataset do we simply continue the base model's training, predicting the next token from the imdb 'text' field input?
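For reference, here is roughly the Quickstart snippet I am asking about (a minimal sketch paraphrased from the docs; in newer TRL versions arguments like dataset_text_field / max_seq_length may have moved to SFTConfig):

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from trl import SFTTrainer

# imdb movie reviews; each example has a raw "text" field
dataset = load_dataset("imdb", split="train")

# a plain causal (next-token-prediction) language model
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# SFTTrainer tokenizes/packs the "text" field and trains with the
# usual causal-LM loss, i.e. cross-entropy on the next token
trainer = SFTTrainer(
    model,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=512,
)
trainer.train()
```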

The same question applies to the 'Format your input prompts' section (i.e. instruction tuning) on the same page. In this case, when using a typical autoregressive, decoder-based model, do we continue its training by providing properly formatted 'input-response' text (text = f"### Question: {example['question'][i]}\n ### Answer: {example['answer'][i]}"), where its role is again to predict the next word/token from the text provided? So is the difference between the first and the second case simply the format of the text input?
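Again for reference, the formatting function from that section looks roughly like this (my sketch based on the docs; the exact wording there may differ slightly):

```python
def formatting_prompts_func(example):
    # turn each (question, answer) pair into one training string;
    # as far as I can tell the model is still trained to predict
    # every next token of this string
    output_texts = []
    for i in range(len(example["question"])):
        text = f"### Question: {example['question'][i]}\n ### Answer: {example['answer'][i]}"
        output_texts.append(text)
    return output_texts

trainer = SFTTrainer(
    model,
    train_dataset=dataset,
    formatting_func=formatting_prompts_func,
    max_seq_length=512,
)
```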

Thanks!