Why are TensorFlow models much slower than PyTorch models for autoregressive modeling?

I am asking about inference specifically. Here is the snippet. Something seems to be wrong.
@thomwolf - Any thoughts? Thanks…