Accuracy is stagnant

Hello … I am following the course, but using a different dataset from load_dataset and with slight modifications.
When I run this code, the accuracy stays exactly the same from one epoch to the next. In the best case I expected it to improve, or at least to vary a little. Any idea?

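import torch
from datasets import load_metric
from tqdm.auto import tqdm

# num_steps is presumably the number of batches per epoch, i.e. len(train_dl)
progress_bar = tqdm(range(num_epochs*num_steps))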

for epoch in range(num_epochs):
    model.train()
    for batch in train_dl:
        model_inputs = {k:v.to(device) for k, v in batch.items()}
        outputs = model(**model_inputs)
        loss = outputs.loss
        loss.backward()
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
        progress_bar.update(1)
    
    model.eval()
    metric = load_metric('accuracy')
    for batch in eval_dl:
        model_inputs = {k:v.to(device) for k, v in batch.items()}
        with torch.no_grad():
            outputs = model(**model_inputs)
        logits = outputs.logits
        predictions = torch.argmax(logits, dim=-1)
        metric.add_batch(predictions=predictions, references=model_inputs['labels'])
    print(metric.compute())

Output is:

{'accuracy': 0.2112}
{'accuracy': 0.2112}
{'accuracy': 0.2112}

This definitely shows your model is not training. A few things to check are:

  • maybe the learning rate is too high or too low?
  • maybe there are problems in your labels and the model can't learn?
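One way to narrow down these two possibilities is to look at what the model actually predicts. An accuracy that is identical across epochs is often a sign that the model outputs the same class for every example, although that is only one possible cause. The sketch below is a minimal, pure-Python illustration: `summarize_predictions` is a hypothetical helper (not part of any library), and the labels and predictions are made-up data, not taken from the post.

```python
from collections import Counter


def summarize_predictions(predictions, references):
    """Hypothetical helper: compare predicted and true class distributions."""
    pred_counts = Counter(predictions)
    label_counts = Counter(references)
    correct = sum(p == r for p, r in zip(predictions, references))
    return {
        "accuracy": correct / len(references),
        "pred_counts": dict(pred_counts),
        "label_counts": dict(label_counts),
        # True when every prediction is the same class
        "collapsed": len(pred_counts) == 1,
    }


# Made-up data: five balanced classes, model always predicts class 2
references = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
predictions = [2] * 10

summary = summarize_predictions(predictions, references)
print(summary)
```

In the evaluation loop from the question, the inputs could be collected by extending two lists with `predictions.tolist()` and `model_inputs['labels'].tolist()` for each batch. If `collapsed` is true, the learning rate is a likely suspect; if the label counts look wrong (for example, unexpected class ids), the labels deserve a closer look.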

Yep! You were correct … it looks like my learning rate was too high. I originally set it to 1e-3, but then changed it to 5e-5, and that seems to work:

{'accuracy': 0.341}
{'accuracy': 0.3574}
{'accuracy': 0.3592}

Thanks for the pointer Sylvain!
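For readers hitting the same issue, here is a rough sketch of where the learning rate from the fix is set. The thread does not show how `optimizer` and `lr_scheduler` were created, so everything here is an assumption: the tiny `nn.Linear` model is a stand-in, `num_steps = 10` is a placeholder for `len(train_dl)`, and the schedule is a plain linear decay to zero without warmup, built with PyTorch's `LambdaLR`.

```python
import torch
from torch import nn

# Stand-in model (assumption); in the thread this would be the fine-tuned model
model = nn.Linear(4, 3)

num_epochs = 3
num_steps = 10  # placeholder for len(train_dl)
total_steps = num_epochs * num_steps

# 5e-5 is the value that worked in the thread; 1e-3 was too high there
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Assumed schedule: linear decay from the base learning rate to 0, no warmup
lr_scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: max(0.0, 1.0 - step / total_steps)
)

initial_lr = optimizer.param_groups[0]["lr"]
for _ in range(total_steps):
    # No gradients are computed here; this only walks through the schedule
    optimizer.step()
    lr_scheduler.step()
final_lr = optimizer.param_groups[0]["lr"]

print(initial_lr, final_lr)
```

Printing `optimizer.param_groups[0]["lr"]` during real training is a cheap way to confirm that the optimizer is using the learning rate you intended.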