Not able to minimize loss during finetuning

Hi, I am trying to finetune Qwen/Qwen1.5-0.5B on the mlabonne/guanaco-llama2-1k dataset using Hugging Face Transformers. I have tried various combinations of hyperparameters, but the loss curve still zigzags instead of decreasing steadily.

Any guidance here would be appreciated.