Could someone please explain how to build a multi-label dataset from a CSV file?

I have a CSV file with two columns: column 1 (‘sentence’) holds thousands of sentences, and column 2 (‘label’) marks each one as ‘type1’ or ‘type2’. I need to build a classifier that learns to sort incoming sentences into these two categories.

I tried to load the model and the data like this:

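import pandas as pd
from datasets import Dataset
from transformers import AutoModelForSequenceClassification

# Two output classes: 'type1' and 'type2'
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)

# Load the CSV and wrap it in a Hugging Face Dataset
df = pd.read_csv('filename.csv')
ds = Dataset.from_pandas(df)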

but it never works if I set the model’s num_labels to anything other than 1; I get dimension errors. How do I specify in the dataset that there are 2 labels? (And, more generally, how do I specify that the label column is categorical, with N possible classes?)
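
To make the question concrete, here is a minimal sketch of the encoding step that seems to be missing: turning the string labels into integer class ids. It assumes each sentence carries exactly one of the two labels (usually called single-label or multi-class classification rather than multi-label). The inline CSV text is a hypothetical stand-in for filename.csv, and the names `label2id`, `id2label`, and `encoded` are illustrative only.

```python
import csv
import io

# Hypothetical stand-in for filename.csv: columns 'sentence' and 'label'.
csv_text = """sentence,label
The cat sat on the mat.,type1
Stocks fell sharply today.,type2
A dog barked all night.,type1
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Build a stable mapping from label strings to integer class ids.
label_names = sorted({row["label"] for row in rows})
label2id = {name: i for i, name in enumerate(label_names)}
id2label = {i: name for name, i in label2id.items()}

# Replace each string label with its integer id (one id per sentence).
encoded = [
    {"sentence": row["sentence"], "label": label2id[row["label"]]}
    for row in rows
]

# The number of distinct classes is what num_labels should match.
num_labels = len(label_names)

print(label2id)
print(num_labels)
print(encoded[1])
```

With the Hugging Face `datasets` library used in the snippet above, `Dataset.class_encode_column("label")` is one way to get a comparable integer encoding, and `num_labels`, `id2label`, and `label2id` can be passed to `from_pretrained`. That part is left out here to keep the sketch free of third-party dependencies.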

I’m really just trying to build a basic sentence classifier from my own labeled data…

Here’s an example: Transformers-Tutorials/Fine_tuning_BERT_(and_friends)_for_multi_label_text_classification.ipynb (master branch of NielsRogge/Transformers-Tutorials on GitHub)

Thank you!