I don’t use NLP models much, but I think it’s normal for the output to differ between versions of an AI model released more than a year apart…
In a normal program, even 10-year-old code can still work, but in AI things can change within 6 months.
But if they seem too different, maybe some default parameter has been changed or something. The output can change quite a bit depending not only on the version of the model, but also on the version of the library.
We are sharing this in case it helps anyone understand why the two SetFit versions don’t produce the exact same models.
We were unable to produce the exact same models using SetFit v0.6.0 and SetFit v1.0.3. As part of our investigation, we noticed that there were several factors that led to different fine-tuned models between these SetFit versions:
In our original post I stated that each model was reproducible, but that actually wasn’t the case when we started troubleshooting. Even though we had set a seed in setfit.SetFitTrainer() (v0.6.0) and setfit.TrainingArguments() (v1.0.3), the SetFit model’s head was being initialised with random weights at the start of every training run. This meant that the training script produced a different fine-tuned model after each run. This issue was resolved by adding a transformers.trainer_utils.set_seed() call before calling the SetFitModel.from_pretrained() function.
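In case it helps anyone, here is a minimal sketch of that fix. The base model name and seed value are just placeholders; the important part is that the seed is set before the model (and therefore its randomly initialised head) is created.

```python
from transformers.trainer_utils import set_seed
from setfit import SetFitModel

# Fix the RNG state *before* creating the model, so the randomly
# initialised classification head is identical on every run.
set_seed(42)

# "sentence-transformers/paraphrase-mpnet-base-v2" is only an example base model.
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")
```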
Having done this, we got back to the state we were in when we posted. The following is the explanation we found for the difference in model outputs.
The SetFit training process also involves creating positive and negative sentence pairs. We noticed that the sampling methods in the two SetFit versions are implemented differently and follow different logic. SetFit v1.0.3 uses the shuffle_combinations() function and the ContrastiveDataset() class in sampler.py to generate and select pairs, whereas SetFit v0.6.0 uses the sentence_pairs_generation() function in modeling.py to generate and select pairs.
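To illustrate why this matters, here is a deliberately simplified toy sketch (not the actual SetFit code, and the function names are made up for illustration) of two different pair-sampling strategies. Even with the same sentences and the same seed, two different sampling procedures will generally select different pairs, so the contrastive training data differs and the fine-tuned models end up different.

```python
import random

sentences = ["s0", "s1", "s2", "s3"]
labels = [0, 0, 1, 1]

def sample_pairs_per_sentence(sentences, labels, seed=42):
    """Toy strategy A: for each sentence, randomly pick one positive and one negative partner."""
    rng = random.Random(seed)
    pairs = []
    for sent, lab in zip(sentences, labels):
        pos = [s for s, l in zip(sentences, labels) if l == lab and s != sent]
        neg = [s for s, l in zip(sentences, labels) if l != lab]
        pairs.append((sent, rng.choice(pos), 1.0))  # positive pair
        pairs.append((sent, rng.choice(neg), 0.0))  # negative pair
    return pairs

def sample_pairs_from_combinations(sentences, labels, seed=42):
    """Toy strategy B: enumerate all unique index combinations in a shuffled order, then label each pair."""
    rng = random.Random(seed)
    indices = [(i, j) for i in range(len(sentences)) for j in range(i + 1, len(sentences))]
    rng.shuffle(indices)
    return [(sentences[i], sentences[j], 1.0 if labels[i] == labels[j] else 0.0)
            for i, j in indices]

# Same inputs and seed, but the two strategies yield different pair sets and orderings,
# which in turn lead to different fine-tuned models.
print(sample_pairs_per_sentence(sentences, labels))
print(sample_pairs_from_combinations(sentences, labels))
```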
There may have been some other factors causing this discrepancy as well.