FlaxT5 vs T5X repo

In the past I saw a huge performance difference at inference time between PyTorch T5 and the original T5 repo (by Google, using Mesh TensorFlow), and I wondered if that was due to past_key_values caching.
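To make clear what I mean by past_key_values: in the PyTorch model, caching lets each decoding step reuse the previous keys/values instead of recomputing attention over the whole prefix. A minimal sketch of the kind of comparison I have in mind (t5-small is just a placeholder checkpoint, and the prompt/lengths are arbitrary):

```python
import time
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt",
)

# use_cache=True reuses past_key_values at every decoding step;
# use_cache=False recomputes attention over the full generated prefix.
for use_cache in (True, False):
    start = time.perf_counter()
    out = model.generate(**inputs, max_length=64, use_cache=use_cache)
    print(
        f"use_cache={use_cache}: {time.perf_counter() - start:.3f}s",
        tokenizer.decode(out[0], skip_special_tokens=True),
    )
```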

Now I am moving to the newer T5X repo (by Google, using Flax), but I noticed that a Flax-based T5 is also available in HF Transformers. Is it the same code, or is one of them faster at inference? And could past_key_values again make the difference? A sketch of what I mean by inference with the HF Flax T5 is below.
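This is roughly the HF Flax path I would be timing against T5X (again t5-small as a placeholder checkpoint; note the first call includes XLA compilation, so only the second call reflects steady-state speed):

```python
import time
from transformers import AutoTokenizer, FlaxT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = FlaxT5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="np",
)

# First call triggers XLA compilation of the generation loop.
start = time.perf_counter()
outputs = model.generate(inputs.input_ids, max_length=64)
print(f"first call (incl. compile): {time.perf_counter() - start:.3f}s")

# Second call measures the actual per-request inference time.
start = time.perf_counter()
outputs = model.generate(inputs.input_ids, max_length=64)
print(f"second call: {time.perf_counter() - start:.3f}s")

print(tokenizer.decode(outputs.sequences[0], skip_special_tokens=True))
```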