Hi, when trying to fine-tune TFLongformer with TFTrainer, I got this error:
InvalidArgumentError: 2 root error(s) found.
(0) INVALID_ARGUMENT: Incompatible shapes: [2,1024,12,514] vs. [2,1024,12,513]
[[node while/gradients/while/tf_longformer_for_sequence_classification/longformer/encoder/layer_._0/attention/self/SelectV2_4_grad/BroadcastGradientArgs_1
(defined at /usr/local/lib/python3.7/dist-packages/transformers/trainer_tf.py:633)
]]
[[while/LoopCond/_568/_14]]
(1) INVALID_ARGUMENT: Incompatible shapes: [2,1024,12,514] vs. [2,1024,12,513]
[[node while/gradients/while/tf_longformer_for_sequence_classification/longformer/encoder/layer_._0/attention/self/SelectV2_4_grad/BroadcastGradientArgs_1
(defined at /usr/local/lib/python3.7/dist-packages/transformers/trainer_tf.py:633)
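For context, Longformer internally pads inputs so the sequence length is a multiple of the attention window (512 for `allenai/longformer-base-4096`), and 1024 already satisfies that, so the off-by-one (514 vs. 513) seems to come from inside the attention computation rather than from input padding. A minimal sketch of that padding rule (simplified from the `_pad_to_window_size` logic in the Hugging Face implementation; treat the exact behavior as an assumption, not a verbatim copy):

```python
def pad_to_window_size(seq_len: int, attention_window: int = 512) -> int:
    """Return the sequence length after padding up to a multiple of attention_window."""
    # Amount of padding needed to reach the next multiple (0 if already aligned).
    padding_len = (attention_window - seq_len % attention_window) % attention_window
    return seq_len + padding_len

print(pad_to_window_size(1024))  # -> 1024 (already a multiple of 512, no padding)
print(pad_to_window_size(1000))  # -> 1024 (padded up by 24 tokens)
```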
This is my training configuration:
training_args = TFTrainingArguments(
    output_dir='./results',
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=32,
    per_device_eval_batch_size=16,
    logging_steps=1,
)
with training_args.strategy.scope():
    model = TFLongformerForSequenceClassification.from_pretrained(
        'allenai/longformer-base-4096',
        num_labels=5,
        return_dict=True,
        problem_type="single_label_classification",
    )

trainer = TFTrainer(model=model, args=training_args, train_dataset=train_dataset, eval_dataset=test_dataset)
Someone else reported the same error when using the TensorFlow version of Longformer.