Multilabel multiclass audio classification


I have managed to adapt the audio classification tutorial to my own dataset:

I can now fine-tune a wav2vec model on my dataset. I am currently fine-tuning a classifier on the sentiment label.

However, the dataset also contains six other labels for emotion:

Each label can have up to 15 different classes.

The question is how to train a single model that uses all six emotion labels as targets simultaneously.
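To make the question concrete, here is a minimal PyTorch sketch of one possible setup: a separate linear head per label on top of pooled encoder features, trained with a cross-entropy loss averaged over all labels. This is only an illustration under assumptions, not the tutorial's method. The class `MultiHeadClassifier`, the function `multi_label_loss`, the hidden size of 768, and the use of 15 classes for every label are all hypothetical, and random tensors stand in for the pooled wav2vec output.

```python
import torch
import torch.nn as nn

NUM_LABELS = 6     # six emotion labels, as in the post
NUM_CLASSES = 15   # assumption: every label is padded to the 15-class upper bound
HIDDEN_SIZE = 768  # assumption: size of the pooled encoder features


class MultiHeadClassifier(nn.Module):
    """Hypothetical module: one linear classification head per label."""

    def __init__(self, hidden_size=HIDDEN_SIZE, num_labels=NUM_LABELS,
                 num_classes=NUM_CLASSES):
        super().__init__()
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_size, num_classes) for _ in range(num_labels)]
        )

    def forward(self, pooled):
        # pooled: (batch, hidden_size)
        # returns logits of shape (batch, num_labels, num_classes)
        return torch.stack([head(pooled) for head in self.heads], dim=1)


def multi_label_loss(logits, targets):
    # logits: (batch, num_labels, num_classes)
    # targets: (batch, num_labels), integer class indices
    # cross_entropy expects the class dimension second, hence the transpose;
    # the default reduction averages over every (example, label) pair.
    return nn.functional.cross_entropy(logits.transpose(1, 2), targets)


# Random stand-ins for pooled encoder features and for the six targets.
pooled = torch.randn(4, HIDDEN_SIZE)
targets = torch.randint(0, NUM_CLASSES, (4, NUM_LABELS))

model = MultiHeadClassifier()
logits = model(pooled)
loss = multi_label_loss(logits, targets)
```

In this sketch each label keeps its own softmax, so exactly one class is predicted per label; if the labels have different numbers of classes, each head could be given its own output size and the losses summed per head instead.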

Would it be possible to group all six labels as a list or an array and use that as a single target?
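As an illustration of what such a grouped target might look like, the sketch below maps each example's labels to a fixed-order list of class indices, one entry per label. The label names and class vocabularies here are hypothetical placeholders (only two labels are shown for brevity), and `encode_targets` is a made-up helper, not part of any library.

```python
# Hypothetical label names and class vocabularies; replace with the real ones.
LABEL_CLASSES = {
    "emotion_a": ["low", "mid", "high"],
    "emotion_b": ["absent", "present"],
}


def encode_targets(example, label_classes=LABEL_CLASSES):
    """Map one example's string labels to a fixed-order list of class indices."""
    return [
        classes.index(example[name])
        for name, classes in label_classes.items()
    ]


example = {"emotion_a": "high", "emotion_b": "absent"}
target = encode_targets(example)
print(target)  # [2, 0]
```

A target of this form keeps one class index per label, which is different from a single flat multi-hot vector: the latter would not encode that each label takes exactly one class.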

I have found a few old posts and articles that provide some pointers, but I am not sure how up to date they are, and I do not really understand the proposed solutions.

Any hints would be most appreciated.