The Trainer API does support TPUs. For example, the language modeling examples can be run on TPU. There’s one thing to take into account when training on TPUs:
Note: On TPU, you should use the `--pad_to_max_length` flag in conjunction with the `--line_by_line` flag to make sure all your batches have the same length.
You can take a look at the scripts for details.
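As a sketch, a TPU launch for the masked language modeling example might look like the following (the script paths, model name, and data file are illustrative assumptions — check the examples directory for the actual layout on your version):

```shell
# Spawn training on 8 TPU cores via the xla_spawn.py helper script.
# Script location, model, and file paths below are assumptions for illustration.
python xla_spawn.py --num_cores 8 \
  language-modeling/run_mlm.py \
  --model_name_or_path bert-base-cased \
  --train_file path/to/train.txt \
  --do_train \
  --output_dir /tmp/test-mlm \
  --line_by_line \
  --pad_to_max_length
```

Without `--pad_to_max_length`, batches are padded dynamically to the longest example in each batch; those varying shapes force repeated XLA recompilations on TPU, which is why fixed-length batches matter there.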