How can I do word classification?

I am very new to NLP, so sorry for the question.
I am wondering if there is a name for this task, and whether I can do it with deep learning or even with Hugging Face :)
Let's say I have some sentences, for example:
s1: I have a dog
s2: I am new to this
I convert my sentences to embedded features, so I now have inputs of shape 1x2048x4 for s1 and 1x2048x5 for s2.
Now I want to classify each word into one of 10 categories, so the outputs would look like:
o1: 1x4x10
o2: 1x5x10

The dependency between the words in each sentence is important for the classification. It is kind of like token classification, but the labels do not describe the token itself; they describe the feeling that the word gives to the sentence.

So the length of the input may change, and the length of the output will change with it, but the classes stay the same.

Can I do it with transformers and self-attention models? Should I look for any specific model or task?
Sorry for the boring question, NLP newbie here with a lot of passion :)

I’m not sure I understood your question entirely, but classifying words the way you describe would be close to a Named Entity Recognition (NER) task, I think. NER is one instance of token classification, where the model predicts one label per token. BERT-like models produce contextual embeddings for an input sentence, with a separate embedding vector for each token (a word or a subword piece). Maybe you could use such a model for your task.
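To make the shapes from the question concrete, here is a minimal sketch in plain PyTorch, not a tuned or recommended architecture. It assumes the precomputed features are 2048-dimensional per word, and the class name `WordClassifier` and all layer sizes are made up for illustration. Self-attention lets each word's prediction depend on the other words in the sentence, and the same model handles sentences of different lengths. Positional information is left out here, on the assumption that the input features already carry it.

```python
import torch
import torch.nn as nn


class WordClassifier(nn.Module):
    """Hypothetical per-word classifier over precomputed word features."""

    def __init__(self, feat_dim=2048, d_model=64, n_heads=4, n_layers=2, n_classes=10):
        super().__init__()
        # Project the 2048-dim features down to the encoder width.
        self.proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=128, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(
            layer, num_layers=n_layers, enable_nested_tensor=False
        )
        # One linear head applied independently at every word position.
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, feats, pad_mask=None):
        # feats: (batch, n_words, feat_dim)
        # pad_mask: optional (batch, n_words) bool tensor, True at padded positions
        x = self.proj(feats)
        x = self.encoder(x, src_key_padding_mask=pad_mask)
        return self.head(x)  # (batch, n_words, n_classes)


torch.manual_seed(0)
model = WordClassifier().eval()

# Random stand-ins for the features in the question: 1x2048x4 and 1x2048x5.
# The encoder expects (batch, n_words, feat_dim), so swap the last two axes.
s1 = torch.randn(1, 2048, 4).transpose(1, 2)
s2 = torch.randn(1, 2048, 5).transpose(1, 2)

with torch.no_grad():
    o1 = model(s1)
    o2 = model(s2)

print(o1.shape, o2.shape)  # torch.Size([1, 4, 10]) torch.Size([1, 5, 10])
```

For training, the per-word logits would typically go into a cross-entropy loss against one label per word. To batch sentences of different lengths, pad them to a common length and pass `pad_mask` so attention ignores the padding.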

Thank you! I think you are suggesting the right path, thanks!
Do you know of a good tutorial or some code I can start with to train a model?

You could look at the Hugging Face course too (Transformer models - Hugging Face Course). There are other implementations of these models out there as well (e.g. fairseq on GitHub), but HF could be relatively easier to start with. You can play with the models already available on HF for NER tasks.
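As a rough starting point, the sketch below shows the token-classification interface in the `transformers` library with 10 labels. To keep it runnable without downloading anything, it builds a tiny, randomly initialised BERT from a config; the config sizes, token ids, and labels are all made up for illustration. In practice you would more likely load a pretrained checkpoint with `AutoModelForTokenClassification.from_pretrained(..., num_labels=10)` together with its tokenizer, and then fine-tune on your own labelled sentences.

```python
import torch
from transformers import BertConfig, BertForTokenClassification

# Tiny, randomly initialised model (assumed sizes) so nothing is downloaded.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=10,  # the 10 categories from the question
)
model = BertForTokenClassification(config).eval()

# Made-up token ids standing in for a tokenised 5-token sentence.
input_ids = torch.tensor([[2, 17, 45, 8, 3]])
attention_mask = torch.ones_like(input_ids)

# One label per token; -100 marks positions the loss ignores
# (commonly used for special tokens and extra subword pieces).
labels = torch.tensor([[-100, 4, 0, 7, -100]])

with torch.no_grad():
    out = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)

logits_shape = tuple(out.logits.shape)  # (batch, n_tokens, num_labels)
predicted = out.logits.argmax(dim=-1)   # one predicted class id per token
print(logits_shape, predicted.shape, out.loss.item())
```

Because the weights here are random, the predictions are meaningless; the point is only the input and output shapes and how per-token labels feed the loss.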