Which BERT model should I use for this problem: next-word prediction with an LM, or keyword extraction?


I am trying to solve a text-based problem. For a given text, say a product description from Amazon, I want to extract the brand name, the type of product, the units, and the colour if it is mentioned in the text.

input : Kore PVC 20-50 Kg Home Gym Set with One Plain + One Curl and One Pair Dumbbell Rods with Gym Accessories and PVC Dumbbells
output :
Kore Gym Set 50Kg

# In the output, the brand is Kore, the type is Gym Set, and the units are 50Kg.

Any references to similar problems, or a short explanation of which direction I should take, would be a great help.

Thanks in advance.

You could treat this as a NER-like (token classification) problem, provided that every brand/type/units value can be identified (marked) directly as a span in the text. A BERT model is well suited to that.
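
To make the NER framing concrete, here is a minimal sketch of how the example title might be labelled with BIO tags and how the tags decode back to fields. The label names (BRAND, TYPE, UNITS), the whitespace tokenisation, and the chosen spans are my assumptions, not something from the question. Note that "50Kg" does not appear verbatim in the input, so the sketch marks "20-50 Kg" as the units span; turning that into "50Kg" would need a post-processing step.

```python
# Sketch only: label names and span choices are illustrative assumptions.

def bio_tags(tokens, spans):
    """Build BIO tags from (start, end_exclusive, label) token-index spans."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def decode(tokens, tags):
    """Collect tagged spans back into (label, text) pairs."""
    entities, label, words = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if label:
                entities.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            words.append(token)
        else:
            if label:
                entities.append((label, " ".join(words)))
            label, words = None, []
    if label:
        entities.append((label, " ".join(words)))
    return entities


text = "Kore PVC 20-50 Kg Home Gym Set with One Plain + One Curl"
tokens = text.split()  # naive whitespace tokenisation, for illustration

# Token indices: Kore=0, 20-50=2, Kg=3, Gym=5, Set=6
spans = [(0, 1, "BRAND"), (2, 4, "UNITS"), (5, 7, "TYPE")]
tags = bio_tags(tokens, spans)
entities = decode(tokens, tags)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
print(entities)
```

Data labelled this way is what you would fine-tune a BERT token-classification model on. A real BERT tokenizer splits words into sub-word pieces, so the word-level tags would still have to be aligned to those pieces.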

Which approach works best depends partly on how much labeled training data you have, and how it is organised.

An alternative is to treat it as a summarisation problem. In that case I would first consider a seq-to-seq model, like BART or T5.
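
For the seq-to-seq framing, the training data would be (source text, target text) pairs rather than per-token tags. Here is a minimal sketch of building such a pair from the example in the question. The helper name `make_pair`, the task prefix string, and the dictionary keys are all hypothetical choices of mine, and the target simply follows the "Kore Gym Set 50Kg" output format from the question.

```python
# Sketch only: make_pair, the prefix, and the dict keys are illustrative assumptions.

def make_pair(description, brand=None, product_type=None, units=None,
              colour=None, prefix="extract product attributes: "):
    """Build one (source, target) training pair for a seq-to-seq model.

    Fields that are not mentioned in the description (None) are left
    out of the target string.
    """
    fields = [brand, product_type, units, colour]
    target = " ".join(f for f in fields if f is not None)
    return {"source": prefix + description, "target": target}


description = (
    "Kore PVC 20-50 Kg Home Gym Set with One Plain + One Curl and One Pair "
    "Dumbbell Rods with Gym Accessories and PVC Dumbbells"
)
pair = make_pair(description, brand="Kore", product_type="Gym Set", units="50Kg")
print(pair["source"])
print(pair["target"])
```

Because the target is free text, this framing can produce "50Kg" even though the input only contains "20-50 Kg", which a pure span-marking approach cannot do directly. The trade-off is that a generative model can also output values that are not in the description at all.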