Asymmetric Loss Function has no effect in Accelerate

Hi,

I am trying to train a multi-label classification model with Accelerate's distributed training. With PyTorch's BCEWithLogitsLoss(), the loss decreases gradually and the macro AUROC improves as training goes along (best AUROC = 0.60). But when I switch to a custom loss function like ASL (Asymmetric Loss), I don't see any learning happening at all. The loss function can be found in this link
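
For context, here is a condensed sketch of what I believe the loss in that link does, based on the commonly used reference implementation of ASL (I dropped the optional gradient-disabling flag for brevity, so this is my reading of it, not the exact file):

    import torch
    import torch.nn as nn

    class AsymmetricLossSketch(nn.Module):
        # Condensed sketch of Asymmetric Loss (ASL) for multi-label classification.
        # gamma_neg > gamma_pos down-weights easy negatives; clip adds an
        # asymmetric probability margin on the negative side.
        def __init__(self, gamma_neg=4, gamma_pos=1, clip=0.05, eps=1e-8):
            super().__init__()
            self.gamma_neg = gamma_neg
            self.gamma_pos = gamma_pos
            self.clip = clip
            self.eps = eps

        def forward(self, logits, targets):
            # ASL expects raw logits, just like BCEWithLogitsLoss
            p = torch.sigmoid(logits)
            p_pos = p
            p_neg = 1 - p
            if self.clip > 0:
                p_neg = (p_neg + self.clip).clamp(max=1)  # asymmetric clipping
            loss_pos = targets * torch.log(p_pos.clamp(min=self.eps))
            loss_neg = (1 - targets) * torch.log(p_neg.clamp(min=self.eps))
            loss = loss_pos + loss_neg
            # focal-style asymmetric focusing
            pt = p_pos * targets + p_neg * (1 - targets)
            gamma = self.gamma_pos * targets + self.gamma_neg * (1 - targets)
            loss = loss * (1 - pt) ** gamma
            return -loss.sum()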

I ran the training script multiple times with different loss-function hyperparameters. To my surprise, the macro AUROC follows the same pattern every time, and its values match exactly at the corresponding epochs across all of these runs (max. AUROC = 0.49 at epoch 2). From other posts, I can see that ASL performs better than conventional loss functions, but in all of those cases the authors used PyTorch's native distributed training. I am sure I am missing something key here that is hurting my model's performance. It would be helpful if someone could share their insights on this.
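
One thing I am not sure about is whether my metric computation is even valid with multiple processes: I currently compute AUROC on each process's local shard only. The sketch below is how I think predictions and labels should be gathered across processes before scoring (assuming sklearn's roc_auc_score; `val_loader` and the rest of the names are mine, not from my actual script):

    import torch
    from sklearn.metrics import roc_auc_score

    @torch.no_grad()
    def evaluate(model, val_loader, accelerator):
        # Collect predictions/labels from ALL processes before computing AUROC;
        # otherwise each rank scores only its own shard of the data.
        model.eval()
        all_preds, all_labels = [], []
        for img, label in val_loader:
            logits = model(img)
            # accelerator.gather concatenates tensors from every process along dim 0
            all_preds.append(accelerator.gather(torch.sigmoid(logits)))
            all_labels.append(accelerator.gather(label))
        preds = torch.cat(all_preds).cpu().numpy()
        labels = torch.cat(all_labels).cpu().numpy()
        model.train()
        return roc_auc_score(labels, preds, average="macro")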

My code flow:

import torch
import torch.nn as nn
from accelerate import Accelerator
from losses import AsymmetricLoss  # custom loss

model = Resnet50()  # my model definition
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)
optimiser = torch.optim.SGD(model.parameters(), lr=0.0001)
criterion = AsymmetricLoss(gamma_neg=5, gamma_pos=1)  # loss hyperparameters

accelerator = Accelerator()
model, optimiser, trainloader = accelerator.prepare(model, optimiser, trainloader)

for epoch in range(100):
    for i, batch in enumerate(trainloader):
        img, label = batch[0], batch[1]
        y_pred = model(img)  # raw logits; ASL applies sigmoid internally
        loss = criterion(y_pred, label.float())
        training_loss.update(loss.item(), img.size(0))  # running average meter
        optimiser.zero_grad()
        accelerator.backward(loss)
        optimiser.step()
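
As a sanity check, I was thinking of verifying that my ASL implementation reduces to plain BCE when the asymmetry is turned off. If my understanding of the loss is right, gamma_neg = gamma_pos = 0 with clipping disabled should match BCEWithLogitsLoss up to the sum reduction and the eps clamping (assuming the constructor takes a `clip` argument like the reference implementation):

    import torch
    import torch.nn as nn
    from losses import AsymmetricLoss

    logits = torch.randn(4, 10)
    labels = torch.randint(0, 2, (4, 10)).float()

    # With no focusing and no clipping, ASL should collapse to plain BCE.
    asl = AsymmetricLoss(gamma_neg=0, gamma_pos=0, clip=0)
    bce = nn.BCEWithLogitsLoss(reduction="sum")

    print(asl(logits, labels).item())  # should match...
    print(bce(logits, labels).item())  # ...this value (up to eps clamping)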
