Thanks, John, for pointing me to the code. I am using gather_for_metrics() in my code. Is there any way to solve this issue? I have added a snippet of my evaluation function below.
def evaluate(self, model, criterion, dataloader):
    losses = AverageMeter('loss', ':.4f')
    accuracy = AverageMeter('acc', ':.4f')
    model.eval()
    with torch.no_grad():
        for i, batch in enumerate(dataloader):
            img, label = batch[0], batch[1]
            y_pred = model(img)
            loss = criterion(y_pred, label.float())
            losses.update(loss.item(), batch[0].size(0))
            outputs, targets = accelerate.gather_for_metrics((y_pred, label))
            accuracy.update(binary_accuracy(outputs, targets).item(), batch[0].size(0))
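For anyone reading along, the message asks you to "drop the remainder yourself". Here is a minimal sketch of what that could look like, assuming the duplicated padding samples sit at the end of the last gathered batch (that is my understanding of how an uneven final batch gets padded, so please verify it for your setup). The helper name trim_last_gathered_batch is hypothetical and not part of Accelerate.

```python
def trim_last_gathered_batch(gathered, samples_seen, dataset_len):
    """Drop padded duplicates from the tail of a gathered batch.

    gathered:     any sliceable sequence (list, torch.Tensor, ...) holding
                  the results gathered from all processes for one step.
    samples_seen: number of real samples counted in earlier steps.
    dataset_len:  total number of samples in the evaluation dataset.
    """
    remaining = dataset_len - samples_seen
    if remaining < 0:
        raise ValueError("samples_seen exceeds dataset_len")
    # For every step but the last, `remaining` is at least len(gathered),
    # so the slice keeps everything. On the last step it cuts the padding.
    return gathered[:remaining]


# Toy run: 10 samples, 4 samples gathered per step, so the last step
# carries 2 real samples followed by 2 padded duplicates.
dataset_len = 10
steps = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 0, 1]]

samples_seen = 0
kept = []
for gathered in steps:
    trimmed = trim_last_gathered_batch(gathered, samples_seen, dataset_len)
    samples_seen += len(trimmed)
    kept.extend(trimmed)
```

In a loop like the one above, this would be applied to both outputs and targets right after gathering, with dataset_len taken from the evaluation dataset.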
I’m seeing a similar message, but without using Accelerate:
2024-12-26 13:10:06,872 - INFO - The used dataset had no length, returning gathered tensors. You should drop the remainder yourself.
Does anyone know how to get rid of this while keeping dataloader_drop_last=True in the TrainingArguments?
The funny thing is that I’m using Hugging Face’s Dataset class, which does have __len__() (link). Moreover, my compute_metrics() and preprocess_logits_for_metrics() both show that the last batch is indeed dropped, so everything seems fine except for the message.
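If the results really are correct and only the log line is the problem, one possible workaround is to filter it out at the logging level. This is a sketch under assumptions: it relies only on Python's standard logging module, assumes the message reaches handlers attached to the root logger (as the timestamped format above suggests), and the names DropNoLengthMessage and silence_no_length_message are made up for illustration. It hides the message and does not change what is gathered.

```python
import logging

# Start of the INFO message quoted above.
NO_LENGTH_MESSAGE = "The used dataset had no length"


class DropNoLengthMessage(logging.Filter):
    """Reject log records that contain the 'no length' notice."""

    def filter(self, record):
        # Returning False tells the handler to discard the record.
        return NO_LENGTH_MESSAGE not in record.getMessage()


def silence_no_length_message():
    """Attach the filter to every handler currently on the root logger."""
    for handler in logging.getLogger().handlers:
        handler.addFilter(DropNoLengthMessage())
```

Call silence_no_length_message() once, after logging has been configured and before evaluation starts. Handlers added later will not carry the filter.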