Train a Bert Classifier with more than 2 Input Text Columns

Archan · October 25, 2023, 5:53pm

Hi guys, I am trying to train a BERT-based classifier for a problem that has 2 text columns.

The data looks like this:
Text 1 | Text 2 | Label

It is a multi-class problem. I have tried following this notebook, but I am having a hard time using the 2 columns together. If someone can point me to the right resource, that would be really helpful.

hfreedma · October 25, 2023, 7:41pm

Is your problem with the tokenizer? You can pass in two input columns in your custom tokenize function, like:

def tokenize_function(examples):
    return tokenizer(examples["text1"], examples["text2"])

I’m not sure what aspect of the classification you are struggling with. I’m somewhat of a beginner myself, so more details and code examples would make it easier to help you.
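For context, when a BERT tokenizer is called with two lists of strings, it encodes each pair as one sequence: [CLS] text1 [SEP] text2 [SEP], with token_type_ids marking which segment each token belongs to. The sketch below only imitates that layout with a toy whitespace splitter so the structure is easy to see. `encode_pair` is a hypothetical helper, not a library function, and the real tokenizer returns WordPiece integer IDs rather than strings.

```python
# Toy illustration only: mimics the layout a BERT tokenizer produces for a
# text pair. encode_pair is a hypothetical helper, not part of any library.

def encode_pair(text_a, text_b):
    tokens_a = text_a.lower().split()
    tokens_b = text_b.lower().split()
    # Pair layout: [CLS] A [SEP] B [SEP]
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment 0 covers [CLS], A and the first [SEP];
    # segment 1 covers B and the final [SEP].
    token_type_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return {"tokens": tokens, "token_type_ids": token_type_ids}


def tokenize_function(examples):
    # With batched mapping, each column arrives as a list of strings,
    # so the two columns are paired up row by row.
    return [encode_pair(a, b) for a, b in zip(examples["text1"], examples["text2"])]


batch = {"text1": ["How are you"], "text2": ["I am fine"]}
encoded = tokenize_function(batch)
print(encoded[0]["tokens"])
print(encoded[0]["token_type_ids"])
```

With the real tokenizer, the same pairing happens inside the single call tokenizer(examples["text1"], examples["text2"]), so the model sees both columns in one input.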

Archan · October 26, 2023, 3:57am

Any end-to-end solution would be helpful because I want to understand the entire process.

hfreedma · October 26, 2023, 11:30am

It would be better if you posted the code you’ve tried so far and described the specific problem you’re having.
These forums are usually not for tutorial-type solutions but for getting help with specific problems.

Archan · October 27, 2023, 8:20am

I tried your way and it worked.

from transformers import AutoTokenizer

# model_str, dataset2 and dataset_test are defined elsewhere (not shown in this post)
tokenizer = AutoTokenizer.from_pretrained(model_str)

def tokenize_function(examples):
    return tokenizer(examples["ques_resp"], padding="max_length", truncation=True)

tokenized_datasets2 = dataset2.map(tokenize_function, batched=True)
tokenized_dataset_test = dataset_test.map(tokenize_function, batched=True)

# The same seed gives the same shuffle order, so the two ranges do not overlap
small_train_dataset2 = tokenized_datasets2.shuffle(seed=42).select(range(0, 3600))
small_eval_dataset2 = tokenized_datasets2.shuffle(seed=42).select(range(3600, 3976))

display(small_train_dataset2)  # display() is available in notebooks
display(small_eval_dataset2)
