Adding new features to BERT for NER

Our goal is to train BERT for NER using a JSON dataset much like CoNLL-2007, but with two additional features:

  • Distance from top of document (top)
  • Distance from left edge of document (left)
    Note that I don’t want to do this with embeddings, because I think using the raw values directly in training will get much better results; so this link is not what I am looking for. It seems like it should be easy to add the features, but nothing I have tried works.
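For concreteness, a single record in such a dataset might look like the following (the CoNLL-style field names match the column names the Trainer log below reports; the values and the scaling of top/left are made up for illustration):

# Hypothetical example record -- field names follow the post and the
# Trainer log; the values are invented.
example = {
    "id": "0",
    "tokens": ["Invoice", "Number", ":", "12345"],
    "pos_tags": [21, 21, 7, 11],       # per-token POS tag ids
    "chunk_tags": [11, 11, 0, 12],     # per-token chunk tag ids
    "ner_tags": [3, 0, 0, 7],          # per-token NER label ids
    "top": [0.12, 0.12, 0.12, 0.12],   # distance from top of document
    "left": [0.08, 0.21, 0.29, 0.33],  # distance from left edge
}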

I created a new subclass of BertForTokenClassification with a new forward(). The new forward() takes 4 additional parameters (pos, chunk, top, left), which I also pass on in the call to self.bert():

import torch
from torch.nn import CrossEntropyLoss
from transformers import BertForTokenClassification
from transformers.modeling_outputs import TokenClassifierOutput


class AeepBertForTokenClassification(BertForTokenClassification):
    def forward(
        self,
        input_ids=None,
        attention_mask=None,
        token_type_ids=None,
        position_ids=None,
        pos=None,
        chunk=None,
        top=None,
        left=None,
        head_mask=None,
        inputs_embeds=None,
        labels=None,
        output_attentions=None,
        output_hidden_states=None,
        return_dict=None,
    ):
        r"""
        labels (:obj:`torch.LongTensor` of shape :obj:`(batch_size, sequence_length)`, `optional`):
            Labels for computing the token classification loss. Indices should be in ``[0, ..., config.num_labels -
            1]``.
        """
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        # The four extra features are forwarded verbatim to self.bert();
        # this is the call that raises the TypeError shown in the log below.
        outputs = self.bert(
            input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
            position_ids=position_ids,
            pos=pos,
            chunk=chunk,
            top=top,
            left=left,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )

        sequence_output = outputs[0]

        sequence_output = self.dropout(sequence_output)
        logits = self.classifier(sequence_output)

        loss = None
        if labels is not None:
            loss_fct = CrossEntropyLoss()
            # Only keep active parts of the loss
            if attention_mask is not None:
                active_loss = attention_mask.view(-1) == 1
                active_logits = logits.view(-1, self.num_labels)
                active_labels = torch.where(
                    active_loss, labels.view(-1), torch.tensor(loss_fct.ignore_index).type_as(labels)
                )
                loss = loss_fct(active_logits, active_labels)
            else:
                loss = loss_fct(logits.view(-1, self.num_labels), labels.view(-1))

        if not return_dict:
            output = (logits,) + outputs[2:]
            return ((loss,) + output) if loss is not None else output

        return TokenClassifierOutput(
            loss=loss,
            logits=logits,
            hidden_states=outputs.hidden_states,
            attentions=outputs.attentions,
        )

This results in the error message:

Running tokenizer on prediction dataset: 100%|██████████| 26/26 [00:03<00:00,  6.96ba/s]
[INFO|trainer.py:540] 2021-11-04 09:00:04,306 >> The following columns in the training set  don't have a corresponding argument in `AeepBertForTokenClassification.forward` and have been ignored: pos_tags, id, tokens, ner_tags, chunk_tags.
[INFO|trainer.py:1196] 2021-11-04 09:00:04,334 >> ***** Running training *****
[INFO|trainer.py:1197] 2021-11-04 09:00:04,334 >>   Num examples = 1287
[INFO|trainer.py:1198] 2021-11-04 09:00:04,334 >>   Num Epochs = 3
[INFO|trainer.py:1199] 2021-11-04 09:00:04,334 >>   Instantaneous batch size per device = 8
[INFO|trainer.py:1200] 2021-11-04 09:00:04,334 >>   Total train batch size (w. parallel, distributed & accumulation) = 8
[INFO|trainer.py:1201] 2021-11-04 09:00:04,334 >>   Gradient Accumulation steps = 1
[INFO|trainer.py:1202] 2021-11-04 09:00:04,334 >>   Total optimization steps = 483
  0%|          | 0/483 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/pydevd.py", line 1483, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/Applications/PyCharm.app/Contents/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/Users/cccc/PycharmProjects/aegis-ml/aegis-ml-ner-trainer/mainSingleThread2.py", line 702, in <module>
    main()
  File "/Users/cccc/PycharmProjects/aegis-ml/aegis-ml-ner-trainer/mainSingleThread2.py", line 610, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/Users/cccc/PycharmProjects/aegis-ml/venv-source/lib/python3.8/site-packages/transformers/trainer.py", line 1316, in train
    tr_loss_step = self.training_step(model, inputs)
  File "/Users/cccc/PycharmProjects/aegis-ml/venv-source/lib/python3.8/site-packages/transformers/trainer.py", line 1849, in training_step
    loss = self.compute_loss(model, inputs)
  File "/Users/cccc/PycharmProjects/aegis-ml/venv-source/lib/python3.8/site-packages/transformers/trainer.py", line 1881, in compute_loss
    outputs = model(**inputs)
  File "/Users/cccc/PycharmProjects/aegis-ml/venv-source/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/cccc/PycharmProjects/aegis-ml/aegis-ml-ner-trainer/mainSingleThread2.py", line 87, in forward
    outputs = self.bert(
  File "/Users/cccc/PycharmProjects/aegis-ml/venv-source/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'pos'
python-BaseException
[ERROR|tokenization_utils_base.py:954] 2021-11-04 10:20:38,330 >> Using bos_token, but it is not set yet.
[ERROR|tokenization_utils_base.py:964] 2021-11-04 10:20:38,338 >> Using eos_token, but it is not set yet.

self.bert() resolves to a PyTorch Module (module.py), which of course does not know that we have added features (pos, etc.). Its __call__ just collects *args and **kwargs and forwards them to BertModel.forward(), which does not declare these parameters — hence the TypeError. So I don’t know what I need to do to pass these new parameters on through forward(); I can’t tell for sure what is actually being called when I trace it.
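To illustrate the dispatch mechanism: nn.Module.__call__ does not consume keyword arguments itself, it simply hands them to the module's forward(), so any kwarg that forward() does not declare fails exactly like this. A toy reproduction (Toy is a made-up module, not from the post):

import torch
from torch import nn

class Toy(nn.Module):
    def forward(self, x):  # no 'pos' parameter declared here
        return x

m = Toy()
m(torch.zeros(1), pos=torch.zeros(1))
# -> TypeError: forward() got an unexpected keyword argument 'pos'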

Can someone help me with adding these features?


Any progress on this?

You must update the forward() function of BertModel, in modeling_bert.py, so that it accepts the new arguments.
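Alternatively, if you would rather not modify the library, you can leave BertModel untouched and fuse the numeric features into the classification head after the encoder. Below is a minimal sketch of that post-encoder approach (not from the thread): it assumes top and left arrive as per-token float tensors aligned with input_ids, and it handles only the two numeric features — categorical ones like pos and chunk would need their own encoding. The class name and the fuse-by-concatenation design are assumptions.

import torch
from torch import nn
from torch.nn import CrossEntropyLoss
from transformers import BertForTokenClassification
from transformers.modeling_outputs import TokenClassifierOutput


class LayoutFeatureBertForTokenClassification(BertForTokenClassification):
    """Hypothetical variant: concatenates per-token layout features
    (top, left) onto the encoder output instead of passing them into
    BertModel."""

    def __init__(self, config):
        super().__init__(config)
        # The classifier now sees hidden_size + 2 extra feature dimensions.
        self.classifier = nn.Linear(config.hidden_size + 2, config.num_labels)

    def forward(
        self,
        input_ids=None,
        attention_mask=None,
        token_type_ids=None,
        top=None,
        left=None,
        labels=None,
    ):
        # top/left are NOT passed to self.bert() -- BertModel.forward()
        # does not declare them, which is what caused the TypeError above.
        outputs = self.bert(
            input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
        )
        sequence_output = self.dropout(outputs[0])  # (batch, seq_len, hidden)

        # Stack the two scalar features into (batch, seq_len, 2) and
        # concatenate them onto the hidden states before classification.
        extra = torch.stack([top.float(), left.float()], dim=-1)
        logits = self.classifier(torch.cat([sequence_output, extra], dim=-1))

        loss = None
        if labels is not None:
            # Padding positions labelled -100 are ignored by default
            # (CrossEntropyLoss ignore_index).
            loss = CrossEntropyLoss()(
                logits.view(-1, self.num_labels), labels.view(-1)
            )
        return TokenClassifierOutput(
            loss=loss,
            logits=logits,
            hidden_states=outputs.hidden_states,
            attentions=outputs.attentions,
        )

Because the extra tensors stop at the custom head and are never forwarded into self.bert(), BertModel.forward() never sees keyword arguments it does not declare — which is exactly what the TypeError was complaining about.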