I was trying to run a grid search over hyperparameters for the SetFit trainer, but every classifier head generated after training appears to be identical.
Because of this issue, I have to pickle the classifier head after training and reload it after loading the model (I save the model and perform the evaluation in a separate script):
with open(f"{OUTPUT_DIR}/model_head_backup.pkl", "wb") as mhf:
    pickle.dump(trainer.model.model_head, mhf)
and then in the eval script:
with open(f"{args.model_directory}/model_head_backup.pkl", 'rb') as mhf:
loaded_head = pickle.load(mhf)
model.model_head = loaded_head
Every single run gives the same results on the test dataset, and every pickled head file is byte-for-byte identical according to the sha1sum command.
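For illustration, this is the equivalent check in Python (the run directory names here are made up); hashing any two head dumps from different runs prints the same digest:

import hashlib

# Hypothetical output directories from two grid-search runs that used
# different hyperparameters; both dumps hash to the same value.
for path in ("run_a/model_head_backup.pkl", "run_b/model_head_backup.pkl"):
    with open(path, "rb") as f:
        print(path, hashlib.sha1(f.read()).hexdigest())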
This is the whole training script:
from setfit import SetFitModel, Trainer, TrainingArguments

model = SetFitModel.from_pretrained("BAAI/bge-small-en-v1.5")
print("Training...")
OUTPUT_DIR = args.output
training_args = TrainingArguments(
    output_dir=OUTPUT_DIR,
    use_amp=GPU_AVAILABLE,
    report_to="wandb",
    seed=2025,
    # hyperparameters
    num_epochs=(args.num_epochs, 16),
    l2_weight=args.l2_weight,
    body_learning_rate=args.body_learning_rate,
    num_iterations=1,
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    metric="accuracy",
    column_mapping={"text": "text", "is_hate": "label"},
)
trainer.train()
The fact that I'm specifying a seed probably explains why the initial weights are the same on each run, but I would still expect different hyperparameter values to produce different trained heads.
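As a next debugging step, this is the kind of in-process sanity check I could add right after trainer.train() to rule out a save/load problem (just a sketch):

import hashlib
import pickle

# Hash the trained head while it is still in memory; if this digest is
# already identical across runs, the problem is in training, not saving.
digest = hashlib.sha1(pickle.dumps(trainer.model.model_head)).hexdigest()
print(f"head digest (body_lr={args.body_learning_rate}): {digest}")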