Hi
I’m a Hugging Face newbie and I’m trying to fine-tune DistilBERT for a three-label sentiment classification task.
I am following the Hugging Face Course as a guide, and I am using the following code to train my model:-
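import numpy as np
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.optimizers.schedules import PolynomialDecay
from transformers import TFAutoModelForSequenceClassification

model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=3)

# Decay the learning rate linearly from 5e-5 to 0 over all training steps
lr_scheduler = PolynomialDecay(
    initial_learning_rate=5e-5,
    end_learning_rate=0.,
    decay_steps=num_train_steps
)
opt = Adam(learning_rate=lr_scheduler)
model.compile(optimizer=opt, loss=loss, metrics=['accuracy', F1_metric()])

# encoded_train and encoded_val are the tokenizer outputs shown further down
model.fit(
    encoded_train,
    np.array(y_train),
    validation_data=(encoded_val, np.array(y_val)),
    batch_size=8,
    epochs=3
)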
The loss function is:-
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
The number of training steps is calculated like so:-
batch_size = 8
num_epochs = 3
num_train_steps = (len(encoded_train['input_ids']) // batch_size) * num_epochs
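As a sanity check on that calculation, here is a minimal sketch using the 1040 training examples visible in the tensor shapes in this post. The name num_examples is one I am introducing for illustration only; it stands in for len(encoded_train['input_ids']).

```python
# Worked example of the step count.
# num_examples is a hypothetical stand-in for len(encoded_train['input_ids']),
# taken from the (1040, 512) tensor shape shown in this post.
num_examples = 1040
batch_size = 8
num_epochs = 3

# Floor division would drop a partial final batch, if there were one.
steps_per_epoch = num_examples // batch_size
num_train_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, num_train_steps)  # 130 390
```

Because 1040 is divisible by 8, the floor division loses nothing here; with a dataset size that is not a multiple of the batch size, the count would come out slightly lower than the number of batches actually seen.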
So far, this is very much like the boilerplate code in the course.
My encoded training data looks like this:-
{'input_ids': <tf.Tensor: shape=(1040, 512), dtype=int32, numpy=
array([[  101,   155,  1942, ...,     0,     0,     0],
       [  101, 27900,  7641, ...,     0,     0,     0],
       [  101,   155,  1942, ...,     0,     0,     0],
       ...,
       [  101,   109,  7414, ...,     0,     0,     0],
       [  101,  2809,  1141, ...,     0,     0,     0],
       [  101,  1448,  1111, ...,     0,     0,     0]],
      dtype=int32)>, 'attention_mask': <tf.Tensor: shape=(1040, 512), dtype=int32, numpy=
array([[1, 1, 1, ..., 0, 0, 0],
       [1, 1, 1, ..., 0, 0, 0],
       [1, 1, 1, ..., 0, 0, 0],
       ...,
       [1, 1, 1, ..., 0, 0, 0],
       [1, 1, 1, ..., 0, 0, 0],
       [1, 1, 1, ..., 0, 0, 0]], dtype=int32)>}
Printed with y_train.head(), my labels look like this (my code converts them to a NumPy array before training):-
10 2
147 1
342 1
999 3
811 3
Name: sentiment, dtype: int64
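One thing I am unsure about: the rows above show labels 1, 2 and 3, whereas SparseCategoricalCrossentropy expects integer class indices from 0 to num_labels - 1. Below is a minimal sketch of shifting them, assuming the only values in the column are 1, 2 and 3. The list just copies the five rows shown above, and the names y_head and y_zero_based are mine, for illustration.

```python
# The five label values printed by y_train.head() in this post.
y_head = [2, 1, 1, 3, 3]

# Assumption: the labels are exactly {1, 2, 3}.
# Subtracting 1 maps them onto the class indices {0, 1, 2}.
y_zero_based = [label - 1 for label in y_head]

print(y_zero_based)  # [1, 0, 0, 2, 2]
```

On the real pandas Series the equivalent would be y_train - 1, applied to the validation labels as well.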
I am receiving the following error message:-
Epoch 1/3
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-56-2902befb3adf> in <module>()
     16     validation_data=(encoded_val, np.array(y_val)),
     17     batch_size=8,
---> 18     epochs=3
     19 )

14 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/type_spec.py in __make_cmp_key(self, value)
    381       raise ValueError("Unsupported value type %s returned by "
    382                        "%s._serialize" %
--> 383                        (type(value).__name__, type(self).__name__))
    384
    385   @staticmethod

ValueError: Unsupported value type BatchEncoding returned by IteratorSpec._serialize
My code is running in Google Colaboratory (Colab) on a GPU runtime.
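The error names BatchEncoding, which is the dict-like object the tokenizer returns (it subclasses collections.UserDict). One workaround I could try is passing a plain dict to model.fit instead. The sketch below is untested against my actual setup and uses a hypothetical stand-in class, FakeBatchEncoding, with made-up values, so that it runs without transformers installed; with the real objects the conversion would be dict(encoded_train) and dict(encoded_val).

```python
from collections import UserDict


class FakeBatchEncoding(UserDict):
    """Hypothetical stand-in for transformers.BatchEncoding (also a UserDict subclass)."""


# Made-up, tiny inputs; the real tensors have shape (1040, 512).
encoded_train = FakeBatchEncoding({
    "input_ids": [[101, 155, 1942, 0], [101, 27900, 7641, 0]],
    "attention_mask": [[1, 1, 1, 0], [1, 1, 1, 0]],
})

# Convert the dict-like wrapper into a plain dict of arrays.
# The values themselves are not copied or changed.
train_inputs = dict(encoded_train)

print(type(train_inputs).__name__, sorted(train_inputs))
```

The idea is that Keras knows how to handle a plain dict of tensors as model input, whereas the wrapper type is what the traceback complains about.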