I want to use GPT2 from Hugging Face transformers inside a TensorFlow Keras model definition.
import tensorflow as tf
from transformers import TFGPT2LMHeadModel

input_ids = tf.keras.layers.Input(
    shape=(max_len,), dtype=tf.int32, name="input_ids"
)
attention_masks = tf.keras.layers.Input(
    shape=(max_len,), dtype=tf.int32, name="attention_masks"
)

gpt2 = TFGPT2LMHeadModel.from_pretrained("gpt2")
gpt2.trainable = True

output_sequences = gpt2.generate(
    input_ids=input_ids,
    attention_mask=attention_masks,
    max_length=max_len * 2,
    temperature=1,
    top_k=0,
    top_p=0.9,
    repetition_penalty=1,
    do_sample=True,
    num_return_sequences=num_return_sequences,
)

model = tf.keras.Model(inputs=[input_ids, attention_masks], outputs=output_sequences)
However, gpt2.generate cannot accept the symbolic input_ids and attention_masks tensors as inputs.
The error:

TypeError: Keras symbolic inputs/outputs do not implement `__len__`. You may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model. This error will also get raised if you try asserting a symbolic input/output directly.
How can I use the generate process of GPT2 inside the model?
The final goal is to compute the loss externally, based on output_sequences, and update the parameters of the model that contains GPT2.