I built my own Dataset and all the features are present and I get the exact same results up until fitting the model.
There I get the above error.
After some research, it seems this is caused by not having the columns in the correct order.
The tokenizer outputs the columns in a different order, and I changed mine to match, but neither the order from the course nor the tokenizer's order seems to work.
Can someone think of another issue?
I don’t have the Data Collator as it’s deprecated now.
Token Type Ids are commented out because the tokenizer does not return them.
I’m using "distilbert-base-cased-distilled-squad" because I just want to experiment, and that seems like the fastest (smallest) model.
Hi @ollibolli, this is a good question! We’re thinking about a refactor of the internals of some of our TF models to make it a bit clearer, because this is one of the most common issues people encounter.
I don’t -think- the order of the columns should matter. Instead, what’s happening here is that you compiled the model with a Keras loss, but you’re passing the labels in the input dictionary. This is explained in more detail in the HF course here: How to ask for help - Hugging Face Course
If you search in that file for the “No gradients provided…” error you’ll see what it is, and how to fix it. If you have any other issues, or you don’t think the course notes do a good job explaining the problem, feel free to let me know!
Hi @ollibolli, that’s probably our fault for not making it intuitive enough! The key idea is that in Keras, the loss is usually computed by a Keras loss function, which you pass to compile() in the loss argument. If you do that, then you need to pass labels in the label_cols argument; if they aren’t there, Keras won’t be able to see your labels, won’t be able to feed them to the loss function, and will complain that there are no gradients (because it couldn’t compute the loss).
With Hugging Face models, though, you can also just skip providing the loss argument to compile() entirely. If you do this, the model will compute loss internally (this is really helpful in some cases, because the loss may be quite complex to specify as a Keras loss). When you do this, the labels should be in the input dictionary (like they are in your code), so that the model can see them.
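To illustrate the internal-loss path, here is a minimal sketch with a made-up toy model (not a real Transformers class) that mimics the behaviour described above: when labels are present in the input dict, the model computes its own loss, so compile() needs no loss argument:

```python
import tensorflow as tf

# Toy stand-in for an HF TF model: if "labels" appears in the input
# dict, the model registers its own loss via add_loss(), so no Keras
# loss function is needed in compile().
class ToyModelWithInternalLoss(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(1)

    def call(self, inputs):
        preds = self.dense(inputs["x"])
        if "labels" in inputs:
            # Internal loss computation (here a simple MSE as a stand-in)
            self.add_loss(tf.reduce_mean(tf.square(preds - inputs["labels"])))
        return preds

model = ToyModelWithInternalLoss()
model.compile(optimizer="adam")  # note: no loss argument

# Labels live inside the input dict, just like option 2 below
data = {"x": tf.random.normal((8, 4)), "labels": tf.random.normal((8, 1))}
history = model.fit(data, epochs=1, verbose=0)
```

With a real Transformers model the mechanism is internal to the library, but the shape of the call is the same: no loss in compile(), labels inside the input dict.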
tl;dr Do one of two things:
Pass a loss argument to compile() + put labels in label_cols
Don’t pass a loss argument to compile() + put labels in columns
We’re well aware that this can be unintuitive, though, and we’re working on a way to make sure the labels ‘just work’ in both cases without these fiddly details.
Oh thanks a lot.
Not at all! There is always room for improvement, though in general I found the course super helpful.
I’m doing that in tandem with Lewis’ book and a bit of Kaggle.
I find it super intuitive; just a week ago I barely knew anything, and I already feel pretty comfortable.
Thanks for taking the time, I’ll try your steps and will report back.
Hi, I tried both versions, but I haven’t had much luck with the TF version of my code.
I think I am not clear on what the labels are.
I thought they would be start_positions and end_positions, but with option 2 (no loss in compile() + labels in columns) I first get a complaint that it needs input_ids; after adding that, I still get the same error.
Hm, this is interesting! Would you be willing to share your whole script so I can try to reproduce it here? You can just give me a couple of rows of your dataset rather than the whole thing, or just any data with the right shape so the script runs.