How do I do inference using the GPT models on TPUs?

The pipeline function does not support TPUs; you will have to manually pass your batch through the model (after placing it on the right XLA device) and then post-process the outputs yourself, as in the sketch below.
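
A minimal sketch of what that looks like with torch_xla, assuming a TPU runtime with torch_xla installed; the "gpt2" checkpoint, the prompt, and the greedy next-token post-processing are just illustrative:

```python
import torch
import torch_xla.core.xla_model as xm
from transformers import AutoModelForCausalLM, AutoTokenizer

# Select the TPU core as an XLA device
device = xm.xla_device()

# "gpt2" is an illustrative checkpoint; any GPT-style causal LM works
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)
model.eval()

# Tokenize a batch and move the input tensors to the XLA device
batch = tokenizer(["Hello, my name is"], return_tensors="pt").to(device)

# Manually run the forward pass on the TPU
with torch.no_grad():
    outputs = model(**batch)

# Post-process on CPU: here, the most likely next token per sequence
next_token_ids = outputs.logits[:, -1, :].argmax(dim=-1).cpu()
print(tokenizer.batch_decode(next_token_ids.unsqueeze(-1)))
```

Note that autoregressive generation loops (e.g. model.generate) can be slow on XLA because the changing sequence length tends to trigger recompilation at each step, so padding inputs to a fixed length helps.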
