I’m looking for a method for NER on semi-structured text (i.e., text with bounding boxes). The challenge is that, because the text is laid out in 2D, we cannot rely on the usual IOB tagging scheme to retrieve entities.
Here’s an example where we want to extract the two addresses as LOC entities:
In this setup, we have these labels (dropping the B-/I- prefixes, since they no longer make sense):
Now, if we were to treat this as plain text by reading it sequentially, line by line, this would give us:
Here, the entities get mixed together because each one spans multiple lines, so recovering entities from the token labels is not trivial.
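To make the problem concrete, here is a minimal sketch in Python. The page layout, the coordinates, the address text, and the helper names (`Token`, `reading_order`, `decode_contiguous`) are all made up for illustration; they only assume two addresses printed side by side, each on two lines.

```python
from itertools import groupby
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    x: float  # left edge of the bounding box (hypothetical units)
    y: float  # top edge of the bounding box
    label: str


# Hypothetical page: two addresses side by side, each spread over two lines.
tokens = [
    Token("12", 0, 0, "LOC"), Token("Main", 30, 0, "LOC"), Token("St", 80, 0, "LOC"),
    Token("Springfield", 0, 20, "LOC"),
    Token("98", 300, 0, "LOC"), Token("Oak", 330, 0, "LOC"), Token("Ave", 370, 0, "LOC"),
    Token("Shelbyville", 300, 20, "LOC"),
]


def reading_order(tokens):
    """Flatten the page top-to-bottom, then left-to-right."""
    return sorted(tokens, key=lambda t: (t.y, t.x))


def decode_contiguous(ordered):
    """Merge runs of consecutive tokens that share a non-O label."""
    return [
        (label, " ".join(t.text for t in run))
        for label, run in groupby(ordered, key=lambda t: t.label)
        if label != "O"
    ]


entities = decode_contiguous(reading_order(tokens))
print(entities)
# [('LOC', '12 Main St 98 Oak Ave Springfield Shelbyville')]
```

Both addresses collapse into a single LOC span, and even adding B-/I- prefixes would not help, since the tokens of the two addresses are interleaved in the flattened sequence.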
The only solution I’ve seen is to add a subtask to group tokens into entities (treating it essentially as relation extraction).
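As a rough illustration of what such a grouping step has to produce, here is a sketch that swaps the learned relation-extraction model for a plain geometric heuristic: same-label tokens whose boxes are close enough get linked, and each connected component becomes one entity. The `Box` type, the `group_entities` function, the sample coordinates, and the `max_dx`/`max_dy` thresholds are all assumptions of mine, not something from an existing library.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Box(NamedTuple):
    text: str
    x0: float
    y0: float
    x1: float
    y1: float
    label: str


def gap(a_lo, a_hi, b_lo, b_hi):
    """Distance between two 1D intervals (0 if they overlap)."""
    return max(0.0, max(a_lo, b_lo) - min(a_hi, b_hi))


def group_entities(tokens, max_dx=15.0, max_dy=15.0):
    """Link nearby same-label tokens, then return one entity per component."""
    parent = list(range(len(tokens)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(tokens)), 2):
        a, b = tokens[i], tokens[j]
        if a.label != b.label or a.label == "O":
            continue
        if gap(a.x0, a.x1, b.x0, b.x1) <= max_dx and gap(a.y0, a.y1, b.y0, b.y1) <= max_dy:
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i, tok in enumerate(tokens):
        if tok.label != "O":
            groups[find(i)].append(tok)

    # Order tokens inside each entity, and entities on the page, top-left first.
    entities = [sorted(g, key=lambda t: (t.y0, t.x0)) for g in groups.values()]
    entities.sort(key=lambda g: (g[0].y0, g[0].x0))
    return [(g[0].label, " ".join(t.text for t in g)) for g in entities]


# Hypothetical boxes for two side-by-side addresses.
boxes = [
    Box("12", 0, 0, 20, 10, "LOC"), Box("Main", 30, 0, 70, 10, "LOC"),
    Box("St", 80, 0, 100, 10, "LOC"), Box("Springfield", 0, 20, 110, 30, "LOC"),
    Box("98", 300, 0, 320, 10, "LOC"), Box("Oak", 330, 0, 360, 10, "LOC"),
    Box("Ave", 370, 0, 400, 10, "LOC"), Box("Shelbyville", 300, 20, 400, 30, "LOC"),
]

grouped = group_entities(boxes)
print(grouped)
# [('LOC', '12 Main St Springfield'), ('LOC', '98 Oak Ave Shelbyville')]
```

A fixed distance threshold like this is brittle on real documents (dense tables, varying font sizes), which is presumably why the approaches I’ve seen learn the token-to-token links instead.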