Hi! I’m trying to build a learning-based custom entity extraction model that can extract a specific value from a short piece of text. Concretely, I have a dataset with two columns, “description” and “store_number”, and I want my model to extract the store_number from any description it is given. For instance:
descriptions:
[“FIVE GUYS 2565 DIST 468-981-3409 AL”,
“McDonald’s K6148”]
store_numbers:
[2565,
K6148]
and so on. I’m having a hard time figuring out what kind of model I should train to accomplish this. Initially, I looked at named entity recognition (NER) and token classification, but that doesn’t seem to be the correct approach, since 1) BERT’s NER is limited to things like person, organization, etc., and 2) even if we could identify numbers with such an NER model, we aren’t interested in finding the category/type of an entity, but rather in correctly identifying the entity itself.
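To make the task concrete, here is a minimal sketch of the two example rows above and of the output I’m after, written as plain Python. It assumes the store numbers are stored as strings and that each one appears verbatim in its description; the helper name find_span is just something I made up for illustration, not part of any library.

```python
# Toy version of the dataset: the two example rows from this post.
descriptions = [
    "FIVE GUYS 2565 DIST 468-981-3409 AL",
    "McDonald's K6148",
]
# Assumption: store numbers are kept as strings, since some contain letters.
store_numbers = ["2565", "K6148"]


def find_span(description, store_number):
    """Return (start, end) character offsets of store_number, or None.

    Hypothetical helper: uses the first exact occurrence in the description.
    """
    start = description.find(store_number)
    if start == -1:
        return None
    return start, start + len(store_number)


# One span per row: this is the value the model should learn to pick out.
spans = [find_span(d, s) for d, s in zip(descriptions, store_numbers)]

for description, span in zip(descriptions, spans):
    if span is not None:
        start, end = span
        print(repr(description), "->", description[start:end])
```

In other words, the target is always a substring of the input, and the model’s job is to choose which substring.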
Any help on this would be appreciated. Thanks!