I’m encountering a compatibility issue between the `transformers` library and TensorFlow 2.18 during training with the `TFLongformerForQuestionAnswering` model. The setup includes TensorFlow 2.18, Transformers (latest version as of November 2024), and a custom model head for Q&A fine-tuning.
During training, I receive the following error:
```
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:
required broadcastable shapes
[[node tf_longformer_for_question_answering/longformer/encoder/layer_._0/attention/self/dropout_1/dropout/SelectV2]]
```
The error persists despite reducing batch size, disabling dropout, and updating to the latest `transformers` library version. It seems to originate within the Longformer’s attention mechanism, potentially due to shape incompatibilities or dropout inconsistencies specific to TensorFlow 2.18.
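For anyone reproducing this, one way to disable dropout without editing individual layers is through the model config; this is a sketch assuming the standard config attributes, not necessarily how it was done in the original setup:

```python
from transformers import LongformerConfig, TFLongformerForQuestionAnswering

# Zero out both dropout probabilities via the config so the dropout path
# can be ruled in or out as the source of the shape mismatch.
config = LongformerConfig.from_pretrained("allenai/longformer-base-4096")
config.hidden_dropout_prob = 0.0
config.attention_probs_dropout_prob = 0.0

model = TFLongformerForQuestionAnswering.from_pretrained(
    "allenai/longformer-base-4096", config=config
)
```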
Troubleshooting Steps Taken
- Verified input and attention mask shapes, ensuring compatibility with the model’s expected `(batch_size, sequence_length)` dimensions (a minimal shape-check sketch follows this list).
- Removed dropout layers and tried varying `batch_size` settings.
- Updated the `transformers` library and TensorFlow to the latest versions.
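For reference, the shape check from the first bullet looks roughly like this; the question, context, and `max_length` are placeholders:

```python
from transformers import LongformerTokenizerFast

tokenizer = LongformerTokenizerFast.from_pretrained("allenai/longformer-base-4096")
enc = tokenizer(
    "What is Longformer?",
    "Longformer is a transformer model for long documents.",
    padding="max_length",
    max_length=1024,
    truncation=True,
    return_tensors="tf",
)

# Both should be rank-2 tensors of shape (batch_size, sequence_length).
print(enc["input_ids"].shape)       # (1, 1024)
print(enc["attention_mask"].shape)  # (1, 1024)
```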
Is there an official statement or ongoing update addressing compatibility issues between TensorFlow 2.18 and the `transformers` library?
The error you’re encountering often suggests a mismatch in tensor shapes during operations that require broadcasting. Since you’ve already confirmed the input shapes and tried various configurations, this might be a deeper compatibility issue between TensorFlow 2.18 and the transformers library.
While there might not be an official statement specifically addressing this issue, TensorFlow and Hugging Face are constantly working on improving compatibility. It’s possible that this is a known issue being worked on for future releases.
In the meantime, consider checking GitHub issues for both TensorFlow and the transformers library. Developers and users often report such issues there, and you might find workarounds or patches shared by the community. Additionally, if possible, try running your setup with an earlier version of TensorFlow (e.g., 2.17) to see if the problem persists, as this might help identify if it’s a version-specific issue. If all else fails, reaching out directly to Hugging Face’s support or community forums may provide additional insights or solutions.
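If it helps when comparing a TF 2.17 environment against TF 2.18, or when filing a GitHub issue, the installed versions can be captured with a couple of lines:

```python
import tensorflow as tf
import transformers

# Record the exact versions in each environment under test.
print("TensorFlow:", tf.__version__)
print("Transformers:", transformers.__version__)
```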
Thanks, Steve, for the clarifications. So far, I have not found any issue reported on GitHub regarding these broadcastable shapes, or any issues broadly related to attention dropout.
After checking the versions, I see that both TensorFlow 2.17 and TensorFlow 2.18 run with the same Transformers library version, 4.46.2. So it appears that compatibility between Transformers and TensorFlow 2.18 has not been fully addressed yet.
To assist others with the same issue: I was able to work around the problem by reducing the depth at certain points in the model architecture, which appeared to be where the issue originated.
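As one possible illustration of what "reducing the depth" can look like, here is a sketch that assumes the reduction is in the number of encoder layers; the original fix may have been made elsewhere in the architecture:

```python
from transformers import LongformerConfig, TFLongformerForQuestionAnswering

# Illustrative only: num_hidden_layers is a stand-in for "reducing the depth",
# not the confirmed fix from the post above.
config = LongformerConfig.from_pretrained("allenai/longformer-base-4096")
config.num_hidden_layers = 6  # base checkpoint uses 12 encoder layers

# Weights for the dropped layers are simply not loaded (a warning is printed).
model = TFLongformerForQuestionAnswering.from_pretrained(
    "allenai/longformer-base-4096", config=config
)
print(model.config.num_hidden_layers)  # 6
```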