I am going to do opinion mining on Twitter posts. Users who are against the topic mostly use certain hashtags, and users who support it use others.
Can we give more importance to these hashtags (weight them up)? First, is that a good idea, and second, is it possible to do it in the BERT tokenizer?
Hi Mahdi,
My guess is that with enough training data, the transformer model, and in particular its attention heads, will learn which parts of the text matter most for the classification and which are less relevant. This happens implicitly, as long as the model has enough data. I’m not aware of a way to explicitly force BERT to weight some tokens more than others; however, I’d be happy to be proven wrong by other contributors.
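For what it's worth, one related thing that can be tried on the tokenizer side is not weighting as such, but keeping each frequent hashtag as a single token instead of letting it be split into pieces. The sketch below only covers the first step, collecting candidate hashtags from the corpus. The function name `frequent_hashtags`, the `min_count` threshold, and the sample tweets are all made up for illustration, and whether this helps on a small labeled set is an open question.

```python
import re
from collections import Counter

# A hashtag here is '#' followed by letters, digits, or underscores.
HASHTAG_RE = re.compile(r"#\w+")


def frequent_hashtags(tweets, min_count=2):
    """Return lowercased hashtags that occur in at least `min_count` tweets."""
    counts = Counter()
    for tweet in tweets:
        # Count each hashtag once per tweet so a single tweet cannot dominate.
        counts.update({tag.lower() for tag in HASHTAG_RE.findall(tweet)})
    return sorted(tag for tag, n in counts.items() if n >= min_count)


# Made-up tweets, for illustration only.
tweets = [
    "Great news today #ProTopic",
    "I disagree with this #AntiTopic #antitopic",
    "Still behind it #protopic",
    "No way #AntiTopic",
    "Unrelated #oneoff",
]

new_tokens = frequent_hashtags(tweets)
print(new_tokens)  # ['#antitopic', '#protopic']
```

With the Hugging Face transformers library, a list like this could then be passed to `tokenizer.add_tokens(new_tokens)`, followed by `model.resize_token_embeddings(len(tokenizer))`. The embeddings for the added tokens start out randomly initialised and have to be learned during fine-tuning, so this still depends on having enough labeled examples per hashtag.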
Actually, I thought the same thing, but as with most ML problems, I do not have enough labeled data. Thanks for your help.