Grouping Tokens after Token Classification

hasansalimkanmaz · January 6, 2022, 9:21am

Is there a way to group tokens after token classification via HF? I see something similar in Rasa, but I am not sure it is the best way to do it, because they give the group numbers to the model as labels to train on. As a result, if a document contains more groups than any document in the training data, the Rasa implementation fails.

I think I am looking for a solution like (kinda supervised) clustering, which is independent of the number of groups in a document.

nielsr · January 6, 2022, 2:41pm

Hi,

The token classification pipeline has the ability to group tokens, as seen here.

The front-facing API is an “aggregation strategy”, set through the pipeline's aggregation_strategy argument. See the docs for more info.
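To make the idea of grouping concrete, here is a minimal pure-Python sketch of what merging tagged tokens into entity groups could look like. It is not the pipeline's actual implementation: the function name `group_bio_tokens`, the BIO-style tags, and the sample sentence are all assumptions made for illustration. In the library itself, the equivalent behaviour is requested by passing something like `aggregation_strategy="simple"` when creating the pipeline.

```python
from typing import Dict, List, Optional, Tuple


def group_bio_tokens(tokens: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Merge consecutive (word, BIO tag) pairs into entity groups.

    Hypothetical helper for illustration only; it assumes tags of the
    form "B-XXX", "I-XXX", or "O".
    """
    groups: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None  # group currently being built

    for word, tag in tokens:
        if tag == "O":
            # A token outside any entity closes the open group.
            current = None
            continue

        prefix, _, label = tag.partition("-")
        if prefix == "I" and current is not None and current["entity_group"] == label:
            # Continuation of the open group with the same label.
            current["word"] += " " + word
        else:
            # "B-" tag, label change, or stray "I-" tag: start a new group.
            current = {"entity_group": label, "word": word}
            groups.append(current)

    return groups


# Made-up predictions, standing in for per-token model output.
example = [
    ("Hugging", "B-ORG"), ("Face", "I-ORG"),
    ("is", "O"), ("in", "O"),
    ("New", "B-LOC"), ("York", "I-LOC"), ("City", "I-LOC"),
]
grouped = group_bio_tokens(example)
```

Note that this kind of grouping merges adjacent tokens that belong to the same entity span; it does not assign separate entities to higher-level groups, so it does not depend on how many groups a document contains.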
