Accelerate FSDP training || RuntimeError: Forward order differs across ranks

While training the gpt2-large model on the wikitext dataset with an Accelerate FSDP configuration, I get the following error:

RuntimeError: Forward order differs across ranks: rank 0 is all-gathering 1 parameters while rank 3 is all-gathering -241381119 parameters