Hi, I have a Hugging Face dataset that I would like to convert to a tf.data.Dataset using
The Keras function
model.fit() can take a tuple of the format
(inputs, targets, sample_weights). However, I do not see an option for choosing the weight column when using
How do i get the tf.data.Dataset to be in the format
(inputs, targets, sample_weights)?
Hi @9las, this isn’t natively supported, but you can make it work pretty easily! The key idea is that the output of
to_tf_dataset() is a
tf.data.Dataset. This means that you can apply all the standard techniques and transformations to it.
For example, one way to handle this would be:
- Add a column to your dataset for
to_tf_dataset() to convert your dataset to
tf.data.Dataset and make sure you retain that column
- Add a transformation at the end using
dataset.map() that extracts the
sample_weight key from the input and adds it as a third column. For example, something like this:
def extract_sample_weight(inputs, labels):
weight = inputs.pop("sample_weight")
return inputs, labels, weight
dataset = dataset.map(extract_sample_weight)
As long as your dataset outputs tuples with three elements, Keras will recognize the third one as the sample weight.