How should I configure the dataset for training XLNet / Transformer-XL on multiple GPUs?

Transformer-XL and XLNet use segment-level recurrence, so consecutive segments of the same text stream have to enter the model in order for the cached memory to stay valid.
When doing data-parallel training across multiple GPUs in TensorFlow, how should the dataset be configured so that each GPU keeps receiving the continuation of the same stream at every step?
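To make the continuity requirement concrete: what I want is something like the "batchify" scheme from the original Transformer-XL code, where the corpus is split into `num_gpus * batch_per_gpu` contiguous streams and each stream is pinned to a fixed batch row on a fixed GPU, so that segment `t+1` on a given row is the exact continuation of segment `t`. A minimal pure-Python sketch of that layout (function name and shapes are my own, for illustration only):

```python
def make_contiguous_shards(tokens, num_gpus, batch_per_gpu, seg_len):
    """Yield per-step batches so each GPU row is one contiguous stream.

    tokens:        flat list of token ids for the whole corpus
    num_gpus:      number of data-parallel replicas
    batch_per_gpu: batch rows handled by each GPU
    seg_len:       segment (bptt) length fed to the model per step
    """
    rows = num_gpus * batch_per_gpu
    # Drop the tail so the stream divides evenly into `rows` streams of
    # whole segments (same trimming the Transformer-XL batchify does).
    usable = (len(tokens) // (rows * seg_len)) * rows * seg_len
    stream_len = usable // rows
    streams = [tokens[r * stream_len:(r + 1) * stream_len] for r in range(rows)]

    # At step t, GPU g always reads the same rows, so its next segment
    # continues exactly where its previous segment ended -- which is what
    # the recurrence memory of Transformer-XL / XLNet requires.
    num_steps = stream_len // seg_len
    for t in range(num_steps):
        step = []
        for g in range(num_gpus):
            batch = [
                streams[g * batch_per_gpu + b][t * seg_len:(t + 1) * seg_len]
                for b in range(batch_per_gpu)
            ]
            step.append(batch)  # step[g] is the batch for GPU g
        yield step
```

The key property is that sharding happens across the batch dimension, never across time: each replica must own its rows for the whole epoch, so a plain `tf.data` shuffle or round-robin shard across GPUs would break the segment continuity. Is this row-pinned layout the right way to set it up with TensorFlow's multi-GPU input pipelines, or is there a built-in mechanism for it?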