Not able to overfit a transformer model on my data

I have a very bursty time series that I want to forecast.
I am purposely trying to overfit on a small subset of the data. I took 30 sample trajectories of length 150 to train the transformer model on. I am using a transformer with more than 10 million parameters, but the model doesn’t overfit.
data.shape = (30, 150)
In each trajectory, the first 75 points are shown to the model, and the model should predict the next 75 steps (75-step-ahead forecasting). I am using the Adam optimizer and MSE loss.
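To make the setup concrete, here is a minimal sketch of the input/target split and the loss described above. It uses random placeholder values instead of the real series, and the names `split_trajectory` and `mse` are purely illustrative; none of this is the actual training code.

```python
import random

random.seed(0)

# Sizes taken from the description: 30 trajectories, 150 points each,
# first 75 points as context, remaining 75 as the forecast target.
N_TRAJ, LENGTH, CONTEXT = 30, 150, 75

# Placeholder data standing in for the real bursty series (assumption).
data = [[random.random() for _ in range(LENGTH)] for _ in range(N_TRAJ)]


def split_trajectory(traj, context=CONTEXT):
    """Return (input, target): the first `context` points and the remainder."""
    return traj[:context], traj[context:]


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    assert len(pred) == len(target)
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


pairs = [split_trajectory(t) for t in data]
inputs = [x for x, _ in pairs]
targets = [y for _, y in pairs]

# Sanity check: a "model" that simply memorizes the targets reaches zero
# training loss, which is the value successful overfitting should approach.
memorized_loss = sum(mse(y, y) for y in targets) / len(targets)
print(len(inputs), len(inputs[0]), len(targets[0]), memorized_loss)
```

With only 30 input/target pairs of 75 values each, the training loss of a model with enough capacity would be expected to approach the memorized value computed at the end.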
Do you have any idea why I can’t overfit?