Background
I’m trying to train a model in TensorFlow to classify text according to a fixed set of 5 labels. For example, let’s say I feed my model the following text:
“my advice is that you go ahead with your plans to learn Python, because its syntax is easy for beginners. It’s also great for snake lovers like me!”
After sniffing the text, the model would, ideally, report back how much the text matches my pre-defined labels:
Label                   Prediction
--------------------    ----------
programming_advice      0.99
advice_for_beginners    0.91
cooking_advice          0.11
health_advice           0.10
not_advice              0.01
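Note that the example scores above add up to more than 1, so I’m assuming each label is scored independently (multi-label), which is what a per-label sigmoid gives and a softmax does not. A minimal sketch of the difference, in plain Python; the `logits` values are made up for illustration and are not real model output:

```python
import math

LABELS = [
    "programming_advice",
    "advice_for_beginners",
    "cooking_advice",
    "health_advice",
    "not_advice",
]

# Hypothetical raw model outputs (logits), one per label.
logits = [4.6, 2.3, -2.1, -2.2, -4.6]


def sigmoid_scores(values):
    """Independent score per label (multi-label): need not sum to 1."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax_scores(values):
    """Scores that compete with each other (single-label): sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


multi_label = sigmoid_scores(logits)
single_label = softmax_scores(logits)

for name, score in zip(LABELS, multi_label):
    print(f"{name:<24}{score:.2f}")
```

With a softmax, two labels could never both score above 0.9, so the sigmoid version is the one that matches the kind of output I’m after.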
My question
What is the most efficient way to build such a classifier? I’ve seen several ways to do this, but I’m not sure which one would be best:
1. Fine-tune five different binary classifiers, since there are five labels… but this would take forever to train, so I assume there must be a better way.
2. Make a model with a transformer only, and train it.
3. Make a model with a transformer plus my own Dense layers, and train it.
   - Seen in this sample notebook, which was linked in the transformers documentation.
4. Make a model with a transformer plus my own Dense layers, but freeze the transformer as-is, and only train the Dense layers.
   - Freezing is a common practice with pre-trained computer vision models; I don’t know whether it’s also good practice for NLP.
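To make the last two options concrete (transformer plus Dense layers, either fine-tuned or frozen), here is a rough Keras sketch of what I have in mind. Everything in it is an assumption on my part: `build_encoder` is a tiny hypothetical stand-in for a pre-trained transformer body so the snippet runs without downloading weights, and `VOCAB_SIZE`, `HIDDEN_DIM`, and the head size are arbitrary. The head uses one sigmoid per label with binary cross-entropy, since several labels can apply to the same text.

```python
import numpy as np
import tensorflow as tf

NUM_LABELS = 5     # the five labels from my example
VOCAB_SIZE = 1000  # hypothetical vocabulary size for the stand-in encoder
HIDDEN_DIM = 32    # hypothetical width of the encoder output


def build_encoder():
    """Tiny stand-in for a pre-trained transformer body (an assumption)."""
    return tf.keras.Sequential(
        [
            tf.keras.layers.Embedding(VOCAB_SIZE, HIDDEN_DIM),
            tf.keras.layers.GlobalAveragePooling1D(),
        ],
        name="encoder",
    )


def build_classifier(freeze_encoder):
    encoder = build_encoder()
    # Frozen: only the Dense head is trained.
    # Not frozen: the encoder is fine-tuned together with the head.
    encoder.trainable = not freeze_encoder
    model = tf.keras.Sequential(
        [
            encoder,
            tf.keras.layers.Dense(64, activation="relu"),
            # One independent sigmoid score per label.
            tf.keras.layers.Dense(NUM_LABELS, activation="sigmoid"),
        ]
    )
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model


# Fake batch of 2 texts, each already tokenized to 16 token ids.
token_ids = np.random.randint(0, VOCAB_SIZE, size=(2, 16))

fine_tuned = build_classifier(freeze_encoder=False)
frozen = build_classifier(freeze_encoder=True)

scores = fine_tuned(token_ids)  # first call also creates the weights
_ = frozen(token_ids)

print("output shape:", tuple(scores.shape))
print("trainable tensors (fine-tuned):", len(fine_tuned.trainable_weights))
print("trainable tensors (frozen):", len(frozen.trainable_weights))
```

The only difference between the two variants is the `encoder.trainable` flag, so switching between them should be cheap to try. A single five-output head like this would also replace the five separate binary classifiers from the first option.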
I would be grateful for any suggestions on which of 1-4 works best. I’m still rather new around here, but the Hugging Face community is extremely welcoming and helpful, and I appreciate being here! A big thanks to anybody who can give me some pointers.