Sliding Window Approach for Multilabel Classification

I would like to adapt the sliding window approach in this example for multilabel classification:
https://github.com/huggingface/notebooks/blob/main/examples/question_answering.ipynb
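
For context, the chunking in that notebook boils down to something like the following, adapted here for plain text classification (a minimal sketch; the checkpoint, `max_length`, and `stride` values are just placeholders):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

encodings = tokenizer(
    ["a very long document ..."],    # batch of raw documents
    truncation=True,
    max_length=512,                  # model's token limit
    stride=128,                      # overlap between consecutive chunks
    return_overflowing_tokens=True,  # one encoding per chunk instead of dropping the tail
)

# maps each chunk back to the document it came from
print(encodings["overflow_to_sample_mapping"])
```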

The way I envision it:

  1. If an example exceeds the max token limit, it is broken into chunks using a sliding window with overlap.
  2. Each chunk is run through the model, producing per-chunk label predictions.
  3. The predictions from all chunks are merged, removing duplicates, leaving one set of predicted labels for the parent example (rough sketch after this list).
  4. That merged prediction is evaluated with the loss function.
  5. Backprop according to the loss.
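
For steps 3-5 to support backprop, I imagine the merge has to happen on logits rather than on hard, deduplicated labels. A minimal sketch of what I have in mind, where max-pooling the per-chunk logits stands in for the dedup step (all shapes and names are illustrative):

```python
import torch
import torch.nn as nn

def example_loss(chunk_logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # chunk_logits: (num_chunks, num_labels) raw logits for one parent example
    # target: (num_labels,) multi-hot ground truth
    # Max-pooling over the chunk axis plays the role of "merge and dedupe",
    # but stays differentiable so step 5 (backprop) works end to end.
    pooled = chunk_logits.max(dim=0).values      # (num_labels,)
    return nn.BCEWithLogitsLoss()(pooled, target)

# toy usage: one example split into 3 chunks, 5 candidate labels
logits = torch.randn(3, 5, requires_grad=True)
y = torch.tensor([1.0, 0.0, 1.0, 0.0, 0.0])
loss = example_loss(logits, y)
loss.backward()
```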

Is something like this possible? A challenge I see is that all chunks from a parent example must fit inside one batch, otherwise the loss will be calculated on partial examples (e.g., with batch size 4 and an example that is 5 chunks long, chunks 1-4 will be evaluated and chunk 5 will either be dropped or, worse, pushed into the next batch). I am tempted to simply use Longformer or BigBird and truncate at 4096 tokens, but if my document is 10k tokens long, I run into the same issue.
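
One workaround I am considering for the batching problem: make each training batch exactly one parent document, so its chunks become the batch dimension, and recover an effective batch size with gradient accumulation. A rough sketch (the dataset layout is just illustrative):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ChunkedDocs(Dataset):
    """One item per parent document: all of its chunks plus one multi-hot
    label vector, so a batch can never split a document's chunks."""
    def __init__(self, docs):
        # docs: list of (input_ids of shape (num_chunks, seq_len),
        #                labels of shape (num_labels,)) pairs
        self.docs = docs

    def __len__(self):
        return len(self.docs)

    def __getitem__(self, i):
        return self.docs[i]

# toy data: two documents with 3 and 5 chunks each, 4 candidate labels
docs = [
    (torch.randint(0, 1000, (3, 128)), torch.tensor([1.0, 0.0, 0.0, 1.0])),
    (torch.randint(0, 1000, (5, 128)), torch.tensor([0.0, 1.0, 0.0, 0.0])),
]
loader = DataLoader(ChunkedDocs(docs), batch_size=1, collate_fn=lambda b: b[0])

for input_ids, labels in loader:
    # input_ids: (num_chunks, seq_len) -- the chunk axis plays the role of
    # the model's batch axis; accumulating gradients over several documents
    # before optimizer.step() recovers an effective batch size > 1
    pass
```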

Thanks!