I built my own Dataset and all the features are present and I get the exact same results up until fitting the model.
There I get the above error.
After some research, it seems this is caused by not having the columns in the correct order.
The tokenizer outputs them in a different order, and I changed mine to match, but neither the order from the course nor the order from the tokenizer seems to work.
Can someone think of another issue?
I don't have the Data Collator as it's deprecated now.
Token Type Ids are commented out because the tokenizer does not return them.
I'm using "distilbert-base-cased-distilled-squad" because I just want to try things out, and that seems like the fastest (smallest) model.
Hey Lewis, thanks!
I'm actually doing your book and the course in parallel.
Edit:
I did all the same preprocessing, but this time I switched out the model and used PyTorch to train, and it works. No error.
This is weird; I would still love some insight into this error.
Hi @ollibolli, this is a good question! We're thinking about a refactor of the internals of some of our TF models to make it a bit clearer, because this is one of the most common issues people encounter.
I don't *think* the order of the columns should matter. Instead, what's happening here is that you compiled the model with a Keras loss, but you're passing the labels in the input dictionary. This is explained in more detail in the HF course here: How to ask for help - Hugging Face Course
If you search in that file for the "No gradients provided…" error you'll see what it is, and how to fix it. If you have any other issues, or you don't think the course notes do a good job explaining the problem, feel free to let me know!
Hi @ollibolli, that's probably our fault for not making it intuitive enough! The key idea is that in Keras, the loss is usually computed by a Keras loss function, which you pass to compile() in the loss argument. If you do that, then you need to pass labels in the label_cols argument, and if they aren't there, Keras won't be able to see your labels, won't know what to do with the loss function, and will complain that there are no gradients (because it couldn't compute the loss).
With Hugging Face models, though, you can also just skip providing the loss argument to compile() entirely. If you do this, the model will compute loss internally (this is really helpful in some cases, because the loss may be quite complex to specify as a Keras loss). When you do this, the labels should be in the input dictionary (like they are in your code), so that the model can see them.
tl;dr Do one of two things:
Pass a loss argument to compile() + put labels in label_cols
Donât pass a loss argument to compile() + put labels in columns
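Here is a minimal sketch of the two options using `Dataset.to_tf_dataset()`. The column names, batch size, and loss function below are assumptions based on a typical extractive-QA preprocessing setup; adapt them to your own dataset (and depending on your `datasets` version, you may also need to pass a `collate_fn`):

```python
def option_a(train_ds, model):
    """Option 1: pass a Keras loss to compile() -> labels go in label_cols."""
    import tensorflow as tf

    tf_ds = train_ds.to_tf_dataset(
        columns=["input_ids", "attention_mask"],          # model inputs only
        label_cols=["start_positions", "end_positions"],  # labels for Keras
        shuffle=True,
        batch_size=16,
    )
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
    model.fit(tf_ds)


def option_b(train_ds, model):
    """Option 2: no loss in compile() -> the model computes its internal
    loss, so the labels must be inside the input columns."""
    tf_ds = train_ds.to_tf_dataset(
        columns=["input_ids", "attention_mask",
                 "start_positions", "end_positions"],  # labels in the inputs
        shuffle=True,
        batch_size=16,
    )
    model.compile(optimizer="adam")  # note: no loss argument
    model.fit(tf_ds)
```

The key difference is only where the label columns live: in `label_cols` when Keras computes the loss, in `columns` when the model does.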
We're well aware that this can be unintuitive, though, and we're working on a way to make sure the labels "just work" in both cases without these fiddly details.
Oh thanks a lot.
Not at all! There is always room for improvement, though in general I found the course super helpful.
I'm doing that in tandem with Lewis's book and a bit of Kaggle.
I find it super intuitive; just a week ago I didn't know much, and I feel pretty comfortable already.
Thanks for taking the time. I'll try your steps and report back.
Hi, I tried both versions, but I haven't had much luck with the TF version of my code.
I think I am not clear on what the labels are.
I thought they would be the start and end positions, but with option 2 (no loss in compile() + labels in columns) I first get a complaint that it needs input_ids; after adding that, I still get the same error.
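For what it's worth, start and end positions are indeed the labels for extractive QA: they are *token* indices marking where the answer span falls in the tokenized input. A rough sketch of how they can be derived from the tokenizer's offset_mapping (the character-span offsets here are made up purely for illustration):

```python
def char_span_to_token_span(offsets, answer_start, answer_end):
    """Map a character span [answer_start, answer_end) to token indices,
    given offsets as a list of (char_start, char_end) pairs per token."""
    start_token = end_token = None
    for i, (s, e) in enumerate(offsets):
        if start_token is None and s <= answer_start < e:
            start_token = i
        if s < answer_end <= e:
            end_token = i
    return start_token, end_token


# Hypothetical offsets for "HF is based in NYC", answer "NYC" at chars 15-18:
offsets = [(0, 2), (3, 5), (6, 11), (12, 14), (15, 18)]
print(char_span_to_token_span(offsets, 15, 18))  # -> (4, 4)
```

Those two indices, stored per example as start_positions and end_positions, are what the QA model's loss is computed against.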
Hm, this is interesting! Would you be willing to share your whole script so I can try to reproduce it here? You can just give me a couple of rows of your dataset rather than the whole thing, or just any data with the right shape so the script runs.