xlm-roberta-base always predicts the same class, other models don't

Hi all,
I’m currently trying to fine-tune xlm-roberta-base for a binary classification task, using pretty standard code:

    [...]
    data_files = {'train': train_path, 'test': test_path}
    model_name = 'xlm-roberta-base'
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    train_dataset = create_dataset(data_files['train'], tokenizer, shuffle=True)  # just loads the dataset from file and tokenizes the text
    test_dataset = create_dataset(data_files['test'], tokenizer)

    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
    metric = evaluate.load("accuracy")

    def compute_metrics(eval_pred):
        logits, labels = eval_pred
        predictions = np.argmax(logits, axis=-1)
        return metric.compute(predictions=predictions, references=labels)

    training_args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=5,
        per_device_train_batch_size=16,
        per_device_eval_batch_size=16,
        data_seed=42,
        logging_dir='logs',
        logging_strategy='epoch',
        save_strategy='no',
        evaluation_strategy='epoch',
    )

    trainer = Trainer(model=model, args=training_args, train_dataset=train_dataset, 
     compute_metrics=compute_metrics, eval_dataset=test_dataset)

    trainer.train()
    trainer.save_model()
    [...]

I’m using a balanced dataset with 50% examples of class 0 and 50% of class 1. On the evaluation set, the model always predicts one class, so accuracy stays at chance level. If I switch to another model, for example “bert-base-cased”, it reaches an accuracy of 92% with the same code and data. What am I missing?
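
For reference, here is a minimal sketch of how the "always one class" behaviour can be checked. It assumes the logits come from something like `trainer.predict(test_dataset).predictions`. The helper name `prediction_distribution` and the example logits are hypothetical and made up for illustration; they are not output from the actual run.

```python
import numpy as np

def prediction_distribution(logits, num_labels=2):
    """Return the fraction of examples assigned to each class by argmax."""
    predictions = np.argmax(np.asarray(logits), axis=-1)
    counts = np.bincount(predictions, minlength=num_labels)
    return counts / counts.sum()

# Hypothetical logits for illustration only (NOT taken from a real run):
# every row has a slightly larger score for class 0.
example_logits = np.array([[0.2, 0.1], [0.3, 0.1], [0.25, 0.05], [0.4, 0.2]])
distribution = prediction_distribution(example_logits)
print(distribution)  # fraction of predictions per class
```

A distribution like this, with all the mass on one class while the labels are split 50/50, is the symptom described above.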

Thanks in advance!