I am trying to use BioGPT as a feature encoder, and I want to test whether fine-tuning improves the quality of the embeddings.
So I have two options. The first is to fine-tune BioGPT without passing the labels, and then use the last token of the last hidden state as features for classification with a separate machine-learning model. (Is it possible to fine-tune BioGPT as an encoder with the labels? Do the labels make any difference, since the model is not attempting to classify?)
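For concreteness, this is the feature extraction I mean for the first option, sketched with random tensors standing in for real BioGPT outputs (so no model download is needed; the shapes and the indexing are the part that matters). Note that for a left-to-right model with right padding, "last token" has to mean the last *non-padding* token of each sequence:

```python
import torch

batch, seq_len, hidden = 4, 16, 1024          # 1024 = BioGPT's hidden size

# Stand-ins for model(input_ids, attention_mask=...).last_hidden_state
# and the attention mask produced by the tokenizer.
last_hidden_state = torch.randn(batch, seq_len, hidden)
attention_mask = torch.ones(batch, seq_len, dtype=torch.long)
attention_mask[0, 10:] = 0                    # sequence 0 is padded after token 9

# Index of the last real (non-padding) token in each sequence.
last_token_idx = attention_mask.sum(dim=1) - 1            # shape: (batch,)
features = last_hidden_state[torch.arange(batch), last_token_idx]

print(features.shape)                         # one embedding vector per sequence
```

The resulting `features` matrix (one row per sequence) is what I would then hand to a separate classifier such as scikit-learn's `LogisticRegression`.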
The second option would be to use BioGptForSequenceClassification, which has a sequence classification head (a linear layer) on top, and fine-tune it by passing the labels to the model. I could then either use the fine-tuned model directly for classification, or again take the last token of the last hidden state as features for a separate machine-learning classifier.
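As I understand it, the classification head in the second option amounts to a single linear layer over the pooled last-token hidden state, and passing labels simply adds a cross-entropy loss that back-propagates through the whole encoder during fine-tuning. A minimal sketch of just that head, with a random tensor standing in for the real last-token hidden states (the name `score` mirrors what I believe the Hugging Face head is called, but that is an assumption):

```python
import torch
import torch.nn as nn

batch, hidden, num_labels = 4, 1024, 2        # 1024 = BioGPT's hidden size

# Stand-in for the last-token hidden states coming out of the encoder;
# requires_grad simulates gradients flowing back into the base model.
pooled = torch.randn(batch, hidden, requires_grad=True)
labels = torch.tensor([0, 1, 1, 0])

score = nn.Linear(hidden, num_labels, bias=False)   # the classification head
logits = score(pooled)                              # shape: (batch, num_labels)
loss = nn.functional.cross_entropy(logits, labels)  # only exists because of labels
loss.backward()                                     # gradients reach the encoder

print(logits.shape, loss.item())
```

This is why I suspect the labels only matter in option 2: without them there is no loss term tied to the classes, so option 1's fine-tuning can only be a language-modeling objective.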