Can I train PyTorch T5 on TPU with variable batch shape?

My goal is to group sequences of similar length together in a batch and pad them to match the longest one in that batch. Will this work on TPU, given that the docs say it does not support dynamic shapes? A rough sketch of what I mean is below.
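Roughly, the batching I have in mind (just a sketch; the class name and details are mine):

```python
import torch
from torch.utils.data import Sampler

class LengthGroupedBatchSampler(Sampler):
    """Sort examples by length, then cut consecutive chunks into batches so
    that sequences inside a batch have similar lengths and padding stays small."""

    def __init__(self, lengths, batch_size):
        self.batch_size = batch_size
        # dataset indices sorted by sequence length
        self.sorted_indices = sorted(range(len(lengths)), key=lambda i: lengths[i])

    def __iter__(self):
        for start in range(0, len(self.sorted_indices), self.batch_size):
            yield self.sorted_indices[start:start + self.batch_size]

    def __len__(self):
        return (len(self.sorted_indices) + self.batch_size - 1) // self.batch_size
```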

Hi @marton-avrios,
I’ve done exactly this for T5, based on the following article:
https://towardsdatascience.com/divide-hugging-face-transformers-training-time-by-2-or-more-21bf7129db9e

Here’s the code:
```python

from torch.nn.utils.rnn import pad_sequence

def collate_batch(batch):
    # T5 uses 0 as the pad token id
    pad_token_id = 0

    # Pad source ids and masks to the longest sequence in this batch
    src_ids = pad_sequence([sample['source_ids'] for sample in batch], batch_first=True, padding_value=pad_token_id)
    src_mask = pad_sequence([sample['source_mask'] for sample in batch], batch_first=True, padding_value=0)
    src_text = [sample['source_text'] for sample in batch]

    # Pad target ids, then replace pad tokens with -100 so they are
    # ignored by the cross-entropy loss
    tgt_ids = pad_sequence([sample['target_ids'] for sample in batch], batch_first=True, padding_value=pad_token_id)
    tgt_ids[tgt_ids == pad_token_id] = -100
    tgt_mask = pad_sequence([sample['target_mask'] for sample in batch], batch_first=True, padding_value=0)
    tgt_text = [sample['target_text'] for sample in batch]

    return {
        'source_ids': src_ids,
        'source_mask': src_mask,
        'target_ids': tgt_ids,
        'target_mask': tgt_mask,
        'source_text': src_text,
        'target_text': tgt_text,
    }

```
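You then hand this to the DataLoader as `collate_fn`; roughly like this (the dataset name and batch size are placeholders, and the dataset is assumed to return dicts with the keys used above):

```python
from torch.utils.data import DataLoader

# `train_dataset` must yield dicts with 'source_ids', 'source_mask',
# 'target_ids', 'target_mask', 'source_text' and 'target_text' keys.
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True, collate_fn=collate_batch)
```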

Hmmm… If I understand it correctly, this produces batches of variable shape, which is what I want. But my concern is that TPUs do not support variable tensor shapes: every shape has to be known at compile time and must not depend on the input data. One workaround I can think of is sketched below.
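Just a sketch of that idea (not verified on TPU; the bucket sizes are arbitrary): instead of padding each batch to its exact longest sequence, pad it up to the next length in a small fixed set of bucket sizes, so the compiler only ever sees a handful of distinct shapes.

```python
import torch
import torch.nn.functional as F

# Hypothetical bucket lengths; sequences longer than the last bucket are not handled here.
BUCKET_SIZES = [32, 64, 128, 256, 512]

def pad_to_bucket(ids, mask, pad_token_id=0):
    """Pad (batch, seq_len) tensors on the right up to the next bucket length."""
    seq_len = ids.size(1)
    target_len = next(b for b in BUCKET_SIZES if b >= seq_len)
    extra = target_len - seq_len
    ids = F.pad(ids, (0, extra), value=pad_token_id)
    mask = F.pad(mask, (0, extra), value=0)
    return ids, mask
```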