Hello! I am fine-tuning the 'sshleifer/distilbart-cnn-12-6' model using the following code.
from transformers import AutoModelForSeq2SeqLM
from torch.optim import AdamW
from tqdm import tqdm

model = AutoModelForSeq2SeqLM.from_pretrained('sshleifer/distilbart-cnn-12-6')
optim = AdamW(model.parameters(), lr=params_layered.lr)
for epoch in range(params_layered.epochs):
    with tqdm(train_loader_l, unit="batch") as tepoch:
        for inputs, masks, labels in tepoch:
            optim.zero_grad()
            outputs = model(inputs, attention_mask=masks, labels=labels)
            loss = outputs.loss
            loss.backward()
            optim.step()
This produces great results. I'd then like to save this model and load it back, but I cannot seem to do so correctly. Curiously, when I print out the weights of the fine-tuned model, they are identical to the original pre-trained (before fine-tuning) weights. And indeed, when I save the model and load it back, the results are the same as the model's results before fine-tuning.
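(Not part of my original script, but for reference: a quick way to check whether fine-tuning actually changed anything is to compare every parameter tensor between the fine-tuned model and a fresh pre-trained copy. Below is a minimal sketch, with toy `nn.Linear` modules standing in for the BART checkpoints; `max_weight_diff` is a hypothetical helper, not part of any library.)

```python
import torch
import torch.nn as nn

def max_weight_diff(model_a: nn.Module, model_b: nn.Module) -> float:
    """Largest absolute difference across all matching parameter tensors."""
    diff = 0.0
    for (name_a, p_a), (name_b, p_b) in zip(
        model_a.named_parameters(), model_b.named_parameters()
    ):
        assert name_a == name_b, "models must share the same architecture"
        diff = max(diff, (p_a - p_b).abs().max().item())
    return diff

# Toy demonstration: two identical copies have a diff of exactly 0.0.
torch.manual_seed(0)
a = nn.Linear(4, 4)
b = nn.Linear(4, 4)
b.load_state_dict(a.state_dict())
print(max_weight_diff(a, b))      # 0.0

# A single optimizer step should make the weights diverge.
opt = torch.optim.SGD(b.parameters(), lr=0.1)
loss = b(torch.randn(2, 4)).sum()
loss.backward()
opt.step()
print(max_weight_diff(a, b) > 0)  # True
```

If this check reports 0.0 against the pre-trained checkpoint after training, the problem is in the training loop itself rather than in saving/loading.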
I am testing this by printing out weights. Printing a weight tensor from the fine-tuned model gives:

Output: "tensor([[-0.0369, 0.0782, 0.1621, …, 0.1831, 0.0589, -0.0659], …"

Loading a fresh copy of the pre-trained checkpoint and printing the same tensor:

test = AutoModelForSeq2SeqLM.from_pretrained('sshleifer/distilbart-cnn-12-6')

Output: "tensor([[-0.0369, 0.0782, 0.1621, …, 0.1831, 0.0589, -0.0659], …" — the output matches the print above for all weights printed.
I am saving the weights and loading them back using:

net = AutoModelForSeq2SeqLM.from_pretrained('sshleifer/distilbart-cnn-12-6')
arxiv = torch.load('/dbfs/FileStore/…/summary_model.pt')
net.load_state_dict(arxiv)

(All keys do match successfully.)
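(For context, the full state_dict roundtrip I believe PyTorch expects looks like the sketch below, with a toy `nn.Linear` standing in for the summarization model. The key point is that `torch.save` must be called on the state_dict of the model object that was actually trained, after training finishes.)

```python
import os
import tempfile
import torch
import torch.nn as nn

# Toy stand-in for the fine-tuned model (any nn.Module works the same way).
model = nn.Linear(8, 2)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "summary_model.pt")

    # Save: this must capture the *trained* model's state_dict.
    torch.save(model.state_dict(), path)

    # Load into a freshly constructed model of the same architecture.
    net = nn.Linear(8, 2)
    result = net.load_state_dict(torch.load(path))
    print(result)  # <All keys matched successfully>

    # Verify the reload really reproduced the saved weights.
    for p_saved, p_loaded in zip(model.parameters(), net.parameters()):
        assert torch.equal(p_saved, p_loaded)
```

Note that `load_state_dict` mutates `net` in place; merely calling `torch.load` on the file does not update any model.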
I have also tried:
test = AutoModelForSeq2SeqLM.from_pretrained("/dbfs/FileStore/…/summary-model/")
Both of these loaded models have weights identical to the original pre-trained model as well. I'm not sure what I'm doing wrong and would appreciate any advice!