Use this topic to ask your questions to Matthew Carrigan during his talk: New TensorFlow Features for Transformers and Datasets.
Could the notebook shown in the video be linked?
Sure thing, here’s a Colab link!
For something like a zero-shot model, how does TFAutoModelForSequenceClassification change?
This is answered at 50:06 in the main stream.
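Not a substitute for the answer in the stream, but here is a rough sketch of how I understand it: zero-shot classification typically reuses an NLI checkpoint, so the class itself doesn't change, you still load it with TFAutoModelForSequenceClassification and feed it (text, candidate-label hypothesis) pairs. The checkpoint name below is just an assumption; any NLI-style model with TensorFlow weights should work.

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Assumed NLI checkpoint; swap in any NLI model that ships TensorFlow weights.
checkpoint = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

premise = "The new TensorFlow features make training much simpler."
hypothesis = "This text is about machine learning."  # candidate label phrased as a hypothesis

inputs = tokenizer(premise, hypothesis, return_tensors="tf")
logits = model(**inputs).logits           # scores over the NLI labels
probs = tf.nn.softmax(logits, axis=-1)    # check model.config.id2label for the label order
print(probs)
```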
I understand that padding enables batching data points together, but too many padding tokens make the computation very slow (?). What is the attention_mask for? If I understand correctly, it masks the padding tokens so that the model does not pay attention to them - but do padding tokens still slow down training even when they are masked? I haven't fully understood the purpose of padding and attention_mask, or their impact on speed.
This is answered at 55:00 on the main stream.
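For anyone reading along before watching the recording, here is a minimal sketch (my own example, not from the talk) of what padding and attention_mask look like after tokenization; bert-base-uncased is just an arbitrary choice. The mask stops the model from attending to pad positions, but those positions are still part of the tensor and still get computed over, which is why keeping padding to a minimum helps speed.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

batch = tokenizer(
    ["A short sentence.", "A much longer sentence that needs quite a bit more room."],
    padding=True,          # pad shorter sequences up to the longest one in the batch
    return_tensors="tf",
)

print(batch["input_ids"])       # padded token ids (the pad token id is 0 for BERT)
print(batch["attention_mask"])  # 1 for real tokens, 0 for padding positions
```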
Hi all, just noticed the Colab notebook didn’t have permissions set. It should be accessible now!