Include more features per token while training the BERT model

gsarda · August 21, 2024, 1:34pm

My data is in the following format, where each row represents the journey of one user:

Item 1, Item 2, Item 3, Item 7, Item 10
Item Description 1, Item Description 2…Item Description 10
Item 1 purchased on Monday, Item 2 purchased on Tuesday, …Item 10 purchased on Tuesday…
Item 1 Price, Item 2 Price…Item 10 Price

For 10 users, there would be 10 such journey rows, each containing an item list, an item description list, an item purchase-day list, and an item price list.

Currently, we have trained an MLM on the data of all users, using only the item list, with BertForMaskedLM from the Hugging Face transformers library.
Each journey is transformed to: Item 1, Item 2, Item 3, [MASK], Item 10
At inference time, we try to predict the next item the user should select by providing their history to the model.
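
For concreteness, the inference step could look roughly like the sketch below. It assumes each item is a single token id, uses a small randomly initialised `BertForMaskedLM` as a stand-in for the trained model, and leaves out `[CLS]`/`[SEP]` for brevity. The vocabulary size, `MASK_ID`, the item ids, and the helper `predict_next_item` are all hypothetical.

```python
import torch
from transformers import BertConfig, BertForMaskedLM

# Hypothetical item vocabulary: low ids are special tokens, 4 plays the role of [MASK].
MASK_ID = 4
config = BertConfig(
    vocab_size=50,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=32,
)
model = BertForMaskedLM(config)  # randomly initialised stand-in for the trained model
model.eval()


def predict_next_item(model, history_ids, mask_id=MASK_ID):
    """Append [MASK] to the history and return the top-scoring token id at that position."""
    input_ids = torch.tensor([history_ids + [mask_id]])
    with torch.no_grad():
        logits = model(input_ids=input_ids).logits  # shape: (1, seq_len, vocab_size)
    return int(logits[0, -1].argmax())


# History of four (hypothetical) item ids; the prediction is meaningless
# here because the stand-in model has not been trained.
next_item = predict_next_item(model, [11, 12, 13, 17])
```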

Now we want to include the description, price, day of purchase, etc. as additional features. These features are per token (that is, per item), not per user.

I need help understanding the architecture and code changes required to customize the BERT model for this.
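
One possible direction, offered as a sketch under assumptions rather than a confirmed solution: give each per-token feature its own embedding table, sum those embeddings with the item embedding, and pass the result to `BertForMaskedLM` through its `inputs_embeds` argument. BERT then adds its position embeddings on top as usual. The wrapper class `BertMLMWithItemFeatures`, the bucketing of prices into 10 integer bins, and the convention that index 0 means "unknown" are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
from transformers import BertConfig, BertForMaskedLM


class BertMLMWithItemFeatures(nn.Module):
    """Hypothetical wrapper: adds per-token day and price embeddings to item embeddings."""

    def __init__(self, config, num_days=7, num_price_buckets=10):
        super().__init__()
        self.bert = BertForMaskedLM(config)
        # Index 0 is reserved for padding / unknown feature values.
        self.day_emb = nn.Embedding(num_days + 1, config.hidden_size, padding_idx=0)
        self.price_emb = nn.Embedding(num_price_buckets + 1, config.hidden_size, padding_idx=0)

    def forward(self, input_ids, day_ids, price_ids, attention_mask=None, labels=None):
        # Look up the item (word) embeddings, then add the feature embeddings.
        item_emb = self.bert.get_input_embeddings()(input_ids)
        inputs_embeds = item_emb + self.day_emb(day_ids) + self.price_emb(price_ids)
        # BERT adds position and token-type embeddings to inputs_embeds internally.
        return self.bert(
            inputs_embeds=inputs_embeds,
            attention_mask=attention_mask,
            labels=labels,
        )


config = BertConfig(
    vocab_size=50,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=32,
)
model = BertMLMWithItemFeatures(config)

# One journey of five positions; id 4 stands in for [MASK] (hypothetical ids).
input_ids = torch.tensor([[11, 12, 13, 4, 20]])
day_ids = torch.tensor([[1, 2, 3, 0, 2]])    # 1 = Monday ... 7 = Sunday, 0 = unknown
price_ids = torch.tensor([[3, 5, 1, 0, 7]])  # price bucket per item, 0 = unknown
labels = torch.tensor([[-100, -100, -100, 17, -100]])  # loss only at the masked position

out = model(input_ids, day_ids, price_ids, labels=labels)
```

Here the features at the masked position are set to 0 so that the model cannot read the answer off the price or day. Item descriptions are not covered by this sketch; one option would be to encode each description with a separate text encoder and project the resulting vector to `hidden_size` before adding it in the same way.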
