Hugging Face Transformers sequence classification

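I am not 100% sure this is the cause, but I would check which device the model ends up on before you call the Trainer. As far as I know, the Trainer moves the model and every batch to training_args.device by itself, so moving the model by hand is harmless but usually not required. The datasets are different: a dataset object has no .to() method, so calling .to(device) on it raises an AttributeError. Leave the datasets on the CPU and let the Trainer move each batch.
Something like this might work,

import torch
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

# Move the model explicitly (the Trainer would normally do this as well)
model.to(device)

# Do not call .to(device) on the datasets: they have no such method.
# They stay on the CPU, and the Trainer moves each batch to the device.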

trainer = Trainer(
    model=model,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=train_df_tuning_dataset_tokenized,
    eval_dataset=val_dataset_tokenized
)
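
If you want to see what "moving a batch to the device" amounts to, here is a minimal sketch. It is only an illustration of the idea, not the Trainer's actual code: toy_model is a hypothetical stand-in for your model, and move_batch_to_device and the fake batch are names I made up for the example.

```python
import torch
from torch import nn

device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

# Hypothetical stand-in for the real model, so the sketch runs on its own
toy_model = nn.Linear(4, 2).to(device)

def move_batch_to_device(batch, device):
    """Return a new dict with every tensor in the batch moved to `device`."""
    return {
        key: value.to(device) if isinstance(value, torch.Tensor) else value
        for key, value in batch.items()
    }

# A fake batch created on the CPU, similar in shape to what a data collator returns
batch = {'features': torch.randn(3, 4), 'labels': torch.tensor([0, 1, 0])}
batch = move_batch_to_device(batch, device)

# Model and inputs now live on the same device, so the forward pass works
logits = toy_model(batch['features'])
print(next(toy_model.parameters()).device, batch['features'].device, logits.shape)
```

If the printed devices differ in your own setup, that mismatch is the first thing I would look at.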