How to Optimize Fine-tuning in Hugging Face Transformers?

I'm currently working on a project where I need to fine-tune a pre-trained language model using the Hugging Face Transformers library. Does anyone have practical advice or best practices for optimizing the fine-tuning process? Specifically, I'm interested in how to choose the right hyperparameters, how to deal with overfitting, and how to effectively evaluate model performance on my specific task.
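For context, here is a minimal sketch of the kind of setup in question, assuming a sequence-classification task. The hyperparameter values, the accuracy metric, and the `HPARAMS` dict are illustrative assumptions, not established recommendations; only the metric helper is plain Python, and the Trainer wiring is shown in comments.

```python
# A hedged sketch of a fine-tuning setup for Hugging Face Transformers.
# All hyperparameter values below are assumptions / typical starting points.

def compute_metrics(eval_pred):
    """Accuracy in the (predictions, labels) shape that
    transformers.Trainer passes to compute_metrics."""
    logits, labels = eval_pred
    # argmax over each row of logits, in pure Python
    preds = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(p == y for p, y in zip(preds, labels))
    return {"accuracy": correct / len(labels)}

# Commonly tuned starting points (assumptions; sweep around these values):
HPARAMS = {
    "learning_rate": 2e-5,            # 1e-5 to 5e-5 is a typical sweep range
    "num_train_epochs": 3,            # with early stopping, err on the high side
    "per_device_train_batch_size": 16,
    "weight_decay": 0.01,             # mild regularization against overfitting
    "warmup_ratio": 0.06,             # linear warmup stabilizes early steps
}

# With transformers installed, these plug into the Trainer API roughly as:
#
#   from transformers import (AutoModelForSequenceClassification,
#                             TrainingArguments, Trainer, EarlyStoppingCallback)
#   args = TrainingArguments(output_dir="out",
#                            eval_strategy="epoch",
#                            save_strategy="epoch",
#                            load_best_model_at_end=True,
#                            metric_for_best_model="accuracy",
#                            **HPARAMS)
#   trainer = Trainer(model=model, args=args,
#                     train_dataset=train_ds, eval_dataset=val_ds,
#                     compute_metrics=compute_metrics,
#                     callbacks=[EarlyStoppingCallback(early_stopping_patience=2)])
#   trainer.train()
```

Evaluating on a held-out validation set every epoch, keeping the best checkpoint, and stopping early when the metric plateaus is a common way to catch overfitting before it costs extra epochs.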