Hi, I was going through the documentation and got confused by something.
trainer = Trainer(
model=model, # the instantiated Transformers model to be trained
args=training_args, # training arguments, defined above
train_dataset=train_dataset, # training dataset
eval_dataset=test_dataset # evaluation dataset
)
I couldn’t understand what the type of train_dataset should be, or how the target for the loss calculation is selected.
In the "Fine-tuning in native TensorFlow 2" section there is also no target value. Am I missing something?
model.fit(train_dataset, epochs=2, steps_per_epoch=115)
Thank you
For more context, they are talking about this page: https://huggingface.co/transformers/training.html
I also got confused by this bit of the documentation, but I think this code expects datasets like the ones provided by Hugging Face’s NLP package.
I think they are all based on PyTorch’s Dataset class, but I could be mistaken.
Try using one of the datasets provided by their NLP package and check whether it works correctly.
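For example, here is a minimal sketch of how that could look (assuming the IMDB dataset from the NLP package and a BERT tokenizer; adjust the dataset name and column names to your task):

import nlp
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

# load a ready-made dataset from the NLP package (IMDB is just an example)
train_dataset = nlp.load_dataset("imdb", split="train")

# tokenize the text column; the "label" column is what the loss will be computed against
train_dataset = train_dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length"),
    batched=True,
)

# expose the columns Trainer needs as PyTorch tensors
train_dataset.set_format(type="torch", columns=["input_ids", "attention_mask", "label"])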
Hope this helps!
Hi @suyash21, this post has some explanation of the dataset format expected by Trainer.
Trainer is to be used with PyTorch, so in this case train_dataset needs to be a PyTorch dataset. TFTrainer would expect a TF dataset. The doc page is a bit unclear (the types are right in the signature but too short/wrong in the enumeration). I’ll send a fix for this later today.
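To make the expected format concrete, here is a minimal sketch (the class name and fields are just illustrative, not the exact code from the tutorial). Each item is a dict of tensors; Trainer passes that dict straight to the model, and the model computes the loss internally from the labels key, which is why you never hand Trainer a separate target:

import torch

class SentimentDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings = encodings  # e.g. output of tokenizer(texts, truncation=True, padding=True)
        self.labels = labels        # list of integer class ids

    def __getitem__(self, idx):
        # Trainer feeds this dict to the model; the model computes
        # the loss from the "labels" entry internally.
        item = {key: torch.tensor(val[idx]) for key, val in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

    def __len__(self):
        return len(self.labels)

For TFTrainer (or model.fit), the equivalent would be a tf.data.Dataset of (features, labels) tuples, so the target is carried inside the dataset itself rather than passed separately.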