Falcon-40B inference on older versions of torch

Should there be any accuracy issue when running Falcon-40B on PyTorch versions older than 2.x (the release that added torch.compile and flash attention to torch)? Even after changing the attention to use plain GEMMs and softmax to compute the attention scores and context, the accuracy becomes terrible and the model generates rubbish! Has anyone else had a similar experience, or is there any reason why this happens?
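
For reference, below is a minimal sketch of attention computed with plain GEMMs and softmax that runs without any PyTorch 2.x features. It is not Falcon's actual modeling code: the function name `manual_causal_attention` and the `(batch, heads, seq, head_dim)` layout are assumptions made for illustration. It spells out the details that a fused kernel normally handles internally and that are easy to drop in a manual rewrite: the `1/sqrt(head_dim)` scaling, the causal mask, the softmax dimension, and the softmax precision. Whether any of these is the cause of the problem described above is not established here.

```python
import math
import torch


def manual_causal_attention(q, k, v):
    """Causal attention from two GEMMs and a softmax (hypothetical helper).

    q: (batch, heads, seq_q, head_dim)
    k, v: (batch, heads, seq_k, head_dim), with seq_k >= seq_q
    """
    head_dim = q.size(-1)
    seq_q, seq_k = q.size(-2), k.size(-2)

    # First GEMM: raw scores, scaled by 1/sqrt(head_dim).
    scores = torch.matmul(q, k.transpose(-1, -2)) / math.sqrt(head_dim)

    # Causal mask: query i may attend to keys up to i + (seq_k - seq_q).
    # The offset keeps the mask aligned when cached keys precede the queries.
    q_pos = torch.arange(seq_q, device=q.device).unsqueeze(-1) + (seq_k - seq_q)
    k_pos = torch.arange(seq_k, device=q.device).unsqueeze(0)
    allowed = k_pos <= q_pos  # (seq_q, seq_k), broadcast over batch and heads
    scores = scores.masked_fill(~allowed, float("-inf"))

    # Softmax over the key dimension, done in float32 and cast back so that
    # half-precision inputs do not lose accuracy in the normalization.
    probs = torch.softmax(scores.float(), dim=-1).to(q.dtype)

    # Second GEMM: weighted sum of the values (the context).
    return torch.matmul(probs, v)
```

A quick sanity check on such a rewrite is that the first token's output equals its own value vector, since under a causal mask it can only attend to itself.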