Why are TensorFlow models so much slower than PyTorch models for autoregressive modeling?

The .to('cuda') call was there when I initialised the model. :slight_smile:
I was hoping for a technical answer, though: why is TF slower for generation?

@huggingface team, does this mean the TensorFlow implementations are not suitable for production, since their latency is higher? Can we conclude that?
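
For anyone who wants to reproduce the latency comparison, here is a minimal sketch of a timing harness. It is not the exact script behind the numbers in this thread. The names `measure_latency` and `dummy_generate` are hypothetical, and the placeholder workload is an assumption standing in for a real generation call in each framework.

```python
import statistics
import time


def measure_latency(generate_fn, warmup=2, runs=5):
    """Return (mean, stdev) wall-clock seconds per call of generate_fn."""
    # Warm-up calls are discarded so one-time costs (graph tracing,
    # kernel compilation, memory allocation) do not skew the numbers.
    for _ in range(warmup):
        generate_fn()

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        generate_fn()
        timings.append(time.perf_counter() - start)

    # stdev needs at least two samples; report 0.0 for a single run.
    stdev = statistics.stdev(timings) if len(timings) > 1 else 0.0
    return statistics.mean(timings), stdev


# Placeholder workload (an assumption) so the sketch runs on its own.
# Swap in something like `lambda: model.generate(**inputs)` per framework.
def dummy_generate():
    time.sleep(0.01)


mean_s, stdev_s = measure_latency(dummy_generate)
print(f"mean {mean_s * 1000:.1f} ms, stdev {stdev_s * 1000:.1f} ms")
```

On GPU, work can be launched asynchronously, so the timed callable should probably also materialise the result (for example by copying the output to CPU). Otherwise the measurement may stop before generation has actually finished.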