Hello! I am fine-tuning the 'sshleifer/distilbart-cnn-12-6' model using the following code.
from transformers import AutoModelForSeq2SeqLM
from torch.optim import AdamW
from tqdm import tqdm

model = AutoModelForSeq2SeqLM.from_pretrained('sshleifer/distilbart-cnn-12-6')
optim = AdamW(model.parameters(), lr=params_layered.lr)
for epoch in range(params_layered.epochs):
    with tqdm(train_loader_l, unit="batch") as tepoch:
        for inputs, masks, labels in tepoch:
            optim.zero_grad()
            outputs = model(inputs, attention_mask=masks, labels=labels)
            loss = outputs.loss
            loss.backward()
            optim.step()
This produces great results. I'd then like to save this model and load it back, but I cannot seem to do so correctly. Curiously, when I print out the weights of the fine-tuned model, they are identical to the original pre-trained (before fine-tuning) weights. And indeed, when I save the model and load it back, the results are the same as the model's results before fine-tuning.
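(Not part of my original script, but for reference: a quick way to check whether fine-tuning actually changed anything is to compare every parameter tensor between the fine-tuned model and a fresh pre-trained copy. Below is a minimal sketch, with toy `nn.Linear` modules standing in for the BART checkpoints; `max_weight_diff` is a hypothetical helper, not part of any library.)

```python
import torch
import torch.nn as nn

def max_weight_diff(model_a: nn.Module, model_b: nn.Module) -> float:
    """Largest absolute difference across all matching parameter tensors."""
    diff = 0.0
    for (name_a, p_a), (name_b, p_b) in zip(
        model_a.named_parameters(), model_b.named_parameters()
    ):
        assert name_a == name_b, "models must share the same architecture"
        diff = max(diff, (p_a - p_b).abs().max().item())
    return diff

# Toy demonstration: two identical copies have a diff of exactly 0.0.
torch.manual_seed(0)
a = nn.Linear(4, 4)
b = nn.Linear(4, 4)
b.load_state_dict(a.state_dict())
print(max_weight_diff(a, b))      # 0.0

# A single optimizer step should make the weights diverge.
opt = torch.optim.SGD(b.parameters(), lr=0.1)
loss = b(torch.randn(2, 4)).sum()
loss.backward()
opt.step()
print(max_weight_diff(a, b) > 0)  # True
```

If this check reports 0.0 against the pre-trained checkpoint after training, the problem is in the training loop itself rather than in saving/loading.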
I am testing this by printing out weights. Printing a weight tensor from the fine-tuned model gives:

Output: "tensor([[-0.0369, 0.0782, 0.1621, …, 0.1831, 0.0589, -0.0659], …"

Loading a fresh copy of the pre-trained checkpoint and printing the same tensor:

test = AutoModelForSeq2SeqLM.from_pretrained('sshleifer/distilbart-cnn-12-6')

Output: "tensor([[-0.0369, 0.0782, 0.1621, …, 0.1831, 0.0589, -0.0659], …" — the output matches the print above for all weights printed.
I am saving the weights and loading them back using:

net = AutoModelForSeq2SeqLM.from_pretrained('sshleifer/distilbart-cnn-12-6')
arxiv = torch.load('/dbfs/FileStore/…/summary_model.pt')
net.load_state_dict(arxiv)

(All keys do match successfully.)
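(For context, the full state_dict roundtrip I believe PyTorch expects looks like the sketch below, with a toy `nn.Linear` standing in for the summarization model. The key point is that `torch.save` must be called on the state_dict of the model object that was actually trained, after training finishes.)

```python
import os
import tempfile
import torch
import torch.nn as nn

# Toy stand-in for the fine-tuned model (any nn.Module works the same way).
model = nn.Linear(8, 2)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "summary_model.pt")

    # Save: this must capture the *trained* model's state_dict.
    torch.save(model.state_dict(), path)

    # Load into a freshly constructed model of the same architecture.
    net = nn.Linear(8, 2)
    result = net.load_state_dict(torch.load(path))
    print(result)  # <All keys matched successfully>

    # Verify the reload really reproduced the saved weights.
    for p_saved, p_loaded in zip(model.parameters(), net.parameters()):
        assert torch.equal(p_saved, p_loaded)
```

Note that `load_state_dict` mutates `net` in place; merely calling `torch.load` on the file does not update any model.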
I have also tried:
test = AutoModelForSeq2SeqLM.from_pretrained("/dbfs/FileStore/…/summary-model/")
Both of these loaded models have weights identical to the original pre-trained model as well. I'm not sure what I'm doing wrong and would appreciate any advice!