How to use additional input features for NER?

I haven't tested this fully, but my first thought was to concatenate the additional features to the pooled output of the transformer.
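
A rough sketch of the concatenation idea in PyTorch, untested on a real model. The class name `ConcatFeatureHead`, the sizes (768 hidden units, 10 extra features, 9 labels) and the random tensors standing in for the transformer output are all my own assumptions for illustration:

```python
import torch
import torch.nn as nn


class ConcatFeatureHead(nn.Module):
    """Hypothetical head: classify from encoder output plus extra features."""

    def __init__(self, hidden_size, num_extra_features, num_labels):
        super().__init__()
        # The classifier sees both representations side by side.
        self.classifier = nn.Linear(hidden_size + num_extra_features, num_labels)

    def forward(self, encoder_output, extra_features):
        # Both tensors must match in every dimension except the last one.
        combined = torch.cat([encoder_output, extra_features], dim=-1)
        return self.classifier(combined)


# Random stand-ins; in practice encoder_output would come from the transformer.
pooled = torch.randn(4, 768)   # (batch, hidden_size), assumed sizes
extra = torch.randn(4, 10)     # (batch, num_extra_features), assumed sizes
head = ConcatFeatureHead(hidden_size=768, num_extra_features=10, num_labels=9)
logits = head(pooled, extra)   # (batch, num_labels)
```

Since NER needs one label per token, the same head should also work on per-token hidden states of shape (batch, seq_len, hidden_size), as long as the extra features are per token too, because `nn.Linear` acts on the last dimension only.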

My second thought is to create two separate classification layers: the first takes the pooled output, and the second takes the first layer's output together with the additional features and produces the logits.
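
Here is how I would sketch the two-layer version in PyTorch, assuming the second layer consumes the first layer's output concatenated with the extra features. `TwoStageHead`, `intermediate_size`, the ReLU in between and all the sizes are my assumptions, not something I have tested on a real NER model:

```python
import torch
import torch.nn as nn


class TwoStageHead(nn.Module):
    """Hypothetical head: project the encoder output, then mix in extra features."""

    def __init__(self, hidden_size, intermediate_size, num_extra_features, num_labels):
        super().__init__()
        # First layer only sees the transformer output.
        self.first = nn.Linear(hidden_size, intermediate_size)
        # Second layer sees the first layer's output plus the extra features.
        self.second = nn.Linear(intermediate_size + num_extra_features, num_labels)

    def forward(self, pooled_output, extra_features):
        # ReLU is an assumed choice of non-linearity between the two layers.
        projected = torch.relu(self.first(pooled_output))
        combined = torch.cat([projected, extra_features], dim=-1)
        return self.second(combined)


# Random stand-ins for the transformer output and the additional features.
pooled_output = torch.randn(4, 768)   # (batch, hidden_size), assumed sizes
extra_features = torch.randn(4, 10)   # (batch, num_extra_features), assumed sizes
two_stage = TwoStageHead(768, 128, 10, num_labels=9)
two_stage_logits = two_stage(pooled_output, extra_features)  # (batch, num_labels)
```

Compared with plain concatenation, this lets the first layer shrink the transformer output before the extra features are added, so a handful of extra features is not drowned out by hundreds of hidden dimensions.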
