Hi @eddieparker,
In the pre-transformer era, it was common to build customized models that jointly learn embeddings of these extra features and concatenate them with the input text representation. A representative paper is the following:
Kikuchi, Yuta, et al. "Controlling Output Length in Neural Encoder-Decoders." EMNLP, 2016.
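For illustration, here's a minimal PyTorch-style sketch of that idea (not the paper's exact architecture): the bucketed extra feature gets its own learned embedding, which is concatenated with a pooled text representation before the output head. All names, sizes, and the GRU stand-in encoder are placeholders.

```python
# Sketch: jointly learn an embedding for a bucketed extra feature and
# concatenate it with the text representation. Placeholder sizes/encoder.
import torch
import torch.nn as nn

class TextWithFeatureModel(nn.Module):
    def __init__(self, vocab_size=30522, hidden=256, n_buckets=10, n_classes=2):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)  # stand-in text encoder
        self.feature_emb = nn.Embedding(n_buckets, 32)           # learned embedding for the extra feature
        self.head = nn.Linear(hidden + 32, n_classes)

    def forward(self, input_ids, feature_bucket):
        x = self.token_emb(input_ids)
        _, h = self.encoder(x)            # h: (1, batch, hidden)
        text_repr = h.squeeze(0)          # pooled text representation
        feat_repr = self.feature_emb(feature_bucket)
        return self.head(torch.cat([text_repr, feat_repr], dim=-1))

# toy usage
model = TextWithFeatureModel()
input_ids = torch.randint(0, 30522, (4, 16))   # batch of 4 token-id sequences
feature_bucket = torch.randint(0, 10, (4,))    # bucketed extra feature per example
logits = model(input_ids, feature_bucket)
print(logits.shape)  # torch.Size([4, 2])
```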
However, with Transformers you can usually concatenate the extra features directly with the input text, e.g. `0.8 <sep> this is the input text`. Keep in mind that a Transformer doesn't really understand raw numbers; they are just tokens to it. So you should bucket the numbers into a relatively small set of unique values so the model can learn the association between each bucket and the prediction.
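As a concrete example, here's a tiny sketch of that bucket-then-prepend step; the bucket boundaries, label strings, and `<sep>` separator are assumptions you'd adapt to your data and tokenizer (e.g. use your tokenizer's own special token).

```python
# Sketch of "bucket then prepend". Bucket boundaries, labels, and the
# separator string are assumptions; adapt them to your tokenizer.
def bucket_score(score: float, boundaries=(0.2, 0.4, 0.6, 0.8)) -> str:
    """Map a continuous score to one of a small number of bucket labels."""
    for i, b in enumerate(boundaries):
        if score < b:
            return f"score_bucket_{i}"
    return f"score_bucket_{len(boundaries)}"

def build_input(score: float, text: str, sep: str = "<sep>") -> str:
    """Prepend the bucketed feature so the model sees it as ordinary tokens."""
    return f"{bucket_score(score)} {sep} {text}"

print(build_input(0.8, "this is the input text"))
# -> "score_bucket_4 <sep> this is the input text"
```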
If you're looking for a more complex architecture, take a look at the following paper:
Moreira, Gabriel de Souza P., et al. “Transformers with multi-modal features and post-fusion context for e-commerce session-based recommendation.” arXiv preprint arXiv:2107.05124 (2021).
Note: I'm speaking from my experience with language generation, but I think a classification setup should be able to use these features in a similar way.