Fine-Tuning BERT Question Answering sequence output problem

While following the instructions in "Fine-tuning with custom datasets — transformers 4.7.0 documentation" using TensorFlow Keras, model.fit produces the error below and fails to start training:

from transformers import TFAutoModelForQuestionAnswering

model = TFAutoModelForQuestionAnswering.from_pretrained("bert-base-multilingual-cased")

...

model.fit(...)

TypeError: The two structures don't have the same sequence type. Input structure has type <class 'tuple'>, while shallow structure has type <class 'transformers.modeling_tf_outputs.TFQuestionAnsweringModelOutput'>.

I suspect that it is related to formatting the labels for Keras, as below:

# Keras will expect a tuple when dealing with labels
train_dataset = train_dataset.map(lambda x, y: (x, (y['start_positions'], y['end_positions'])))

since the error message is about the output structure, and it says the tuples are not transformers.modeling_tf_outputs.TFQuestionAnsweringModelOutput.
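To see why the two structures clash, here is a toy, TensorFlow-free illustration. The class and check below are made-up stand-ins (the real comparison happens inside TensorFlow's nest utilities), but the shape of the problem is the same: transformers' ModelOutput classes subclass OrderedDict, so in graph mode y_pred is dict-like while the mapped labels are a plain tuple.

```python
from collections import OrderedDict

class FakeQAOutput(OrderedDict):
    """Stand-in for TFQuestionAnsweringModelOutput (a dict subclass)."""

def check_same_structure(y_true, y_pred):
    # Rough sketch of the kind of type comparison TensorFlow's nest
    # utilities perform when pairing labels with model outputs.
    if isinstance(y_pred, dict) != isinstance(y_true, dict):
        raise TypeError("The two structures don't have the same sequence type.")

y_pred = FakeQAOutput(start_logits=[0.1], end_logits=[0.9])
y_true = ([12], [17])  # tuple produced by the tutorial's map step -> mismatch
try:
    check_same_structure(y_true, y_pred)
except TypeError as e:
    print(e)  # The two structures don't have the same sequence type.
```

Passing the labels as a dict instead of a tuple makes the check succeed, which is the direction the fix below takes.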

Help?

My platform is Windows 10 and libraries are

print(tf.__version__)
print(torch.__version__)
print(transformers.__version__)

2.4.0
1.9.0+cu111
4.8.2

The error here was because the tutorial asks you to use return_dict=False, but in TensorFlow we have to set run_eagerly=True during model compilation for the return_dict parameter to actually take effect.
According to the documentation:

return_dict (bool, optional) – Whether or not to return a ModelOutput instead of a plain tuple. This argument can be used in eager mode, in graph mode the value will always be set to True.

There was actually a warning saying that return_dict could not be set to False, which is how I was able to figure out this issue.
I don’t know the equivalent for PyTorch, but I hope this helps.
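For reference, the compile call I mean looks roughly like this. This is a sketch, not verified against every transformers version; the optimizer and loss choices here are my own assumptions (sparse categorical cross-entropy over the start/end logits is the usual QA setup), only the run_eagerly flag is the point:

```python
import tensorflow as tf
from transformers import TFAutoModelForQuestionAnswering

model = TFAutoModelForQuestionAnswering.from_pretrained(
    "bert-base-multilingual-cased"
)
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
# run_eagerly=True keeps the model out of graph mode, so passing
# return_dict=False in the forward call is respected instead of
# being silently forced back to True.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5),
              loss=loss,
              run_eagerly=True)
```

Note that eager execution is much slower than graph mode, so the dict-based fix further down the thread is preferable for real training runs.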


Hi @lseongjoo,

I am facing the same issue. Have you resolved it?
Thank you!


Hi dev-sajal,

I tried this, but it is not working.

Hi everyone, I finally found a solution.

  1. Do not use this line:
# Keras will expect a tuple when dealing with labels
train_dataset = train_dataset.map(lambda x, y: (x, (y['start_positions'], y['end_positions'])))

In graph mode, the model returns the prediction as a “TFQuestionAnsweringModelOutput”, and this class inherits from a dictionary! So we need to pass the “y” value as a dictionary too.

  2. Replace every “start_positions” with “start_logits” and every “end_positions” with “end_logits”.

The reason is that when the loss is calculated,
TFQuestionAnsweringModelOutput has “start_logits” and “end_logits”,

but

y_true has “start_positions” and “end_positions”.

After the loss calculation, TensorFlow adds “start_logits” and “end_logits” keys to y_true and throws an error saying the lengths of y_pred and y_true are different. When you replace “start_positions” with “start_logits” (and likewise for the end keys), the problem is solved!
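The renaming in step 2 is just a key substitution on the label dict. A minimal sketch (rename_label_keys is a hypothetical helper name, not part of transformers):

```python
def rename_label_keys(labels):
    """Rename the tutorial's label keys to the names used in
    TFQuestionAnsweringModelOutput, so Keras can pair y_true with y_pred."""
    key_map = {"start_positions": "start_logits",
               "end_positions": "end_logits"}
    return {key_map.get(k, k): v for k, v in labels.items()}

print(rename_label_keys({"start_positions": 12, "end_positions": 17}))
# {'start_logits': 12, 'end_logits': 17}
```

In the tf.data pipeline this would go inside the map step, e.g. train_dataset.map(lambda x, y: (x, rename_label_keys(y))), in place of the tuple-producing lambda from step 1.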

NOTE: When you apply step 2, your training code will no longer work for PyTorch training :smiley: There you need to set the dictionary keys back to “start_positions” and “end_positions”. :smiley:
