Text classifier trained with BERT transformers gets F1 = 0 on the full dataset but not on a subset

johntheripper43 · May 22, 2022, 8:25pm

Hello!
I have a dataset of 151008 sentences with only 2 classes (labels).
I wrote a sentence classifier using AutoModelForSequenceClassification, following the Hugging Face Course, and got the following results:
cointegrated/rubert-tiny2 - F1=0.9708
DeepPavlov/rubert-base-cased - F1=0.967
DeepPavlov/rubert-base-cased-conversational - F1=0.9283

These are the results I expected.
But when I use other models on the same dataset (151008 sentences), I get the following results:
sberbank-ai/sbert_large_nlu_ru - F1=0.0
bert-base-multilingual-cased - F1=0.0
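
For reference, a short note on what F1=0.0 means numerically. This assumes the f1 metric is computed with binary averaging and label 1 as the positive class (an assumption; the thread does not say which label is treated as positive). In terms of true positives (TP), false positives (FP) and false negatives (FN):

```latex
F_1 = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}
```

So F1 is exactly 0 only when TP = 0, meaning the model never correctly predicts the positive class. That is consistent with, but does not prove, the model predicting the same class for every sentence.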

However, if I use ¼ of the dataset (37752 sentences), I get reasonable results. I tried both the Trainer and a custom training loop, with the same outcome.
What am I doing wrong, and how can I train the model on the full dataset?
I train in the cloud (Yandex Cloud), in a JupyterLab environment, on 1x V100.
Code:

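import numpy as np
import torch
from datasets import DatasetDict
from transformers import AutoModelForSequenceClassification, AutoTokenizer

path = '/home/jupyter/work/resources/Datasets/dataset_raw'
raw_datasets = DatasetDict.load_from_disk(path)

# Tokenize
# Other checkpoints tried: "DeepPavlov/rubert-base-cased", "sberbank-ai/sbert_large_nlu_ru"
checkpoint = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)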
def tokenize_function(example):
    return tokenizer(example["sentence"], truncation=True, max_length=128)
tokenized_datasets_raw = raw_datasets.map(tokenize_function, batched=True)

#Prepare for training
tokenized_datasets = tokenized_datasets_raw.remove_columns(["sentence", "idx","level_0"])
tokenized_datasets = tokenized_datasets.rename_column("label", "labels")
tokenized_datasets.set_format("torch")

# Use the prepared datasets (extra columns removed, "label" renamed to "labels")
train_dataset = tokenized_datasets["train"]
eval_dataset = tokenized_datasets["test"]

#CREATE TRAINER
from datasets import load_metric
from transformers import TrainingArguments, Trainer
device = torch.device("cuda")
metric = load_metric("f1")
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2).to(device)
training_args = TrainingArguments(output_dir="/home/jupyter/work/resources/Trash", evaluation_strategy="epoch")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)
trainer.train()

x = trainer.predict(test_dataset=eval_dataset)
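
One way to look into an F1 of 0.0 is to count how often each class is predicted and compare that with the true label counts. The sketch below is a minimal, dependency-free illustration and not part of the original post; `argmax_row`, `diagnose_binary` and the toy data are hypothetical names and values made up for the example, and label 1 is assumed to be the positive class.

```python
from collections import Counter


def argmax_row(row):
    """Return the index of the largest value in one row of logits."""
    return max(range(len(row)), key=lambda i: row[i])


def diagnose_binary(logits, labels, pos_label=1):
    """Count predicted vs. actual classes and compute binary F1 by hand."""
    preds = [argmax_row(row) for row in logits]
    labels = [int(y) for y in labels]
    pairs = list(zip(preds, labels))
    tp = sum(1 for p, y in pairs if p == pos_label and y == pos_label)
    fp = sum(1 for p, y in pairs if p == pos_label and y != pos_label)
    fn = sum(1 for p, y in pairs if p != pos_label and y == pos_label)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return {"predicted": Counter(preds), "actual": Counter(labels), "f1": f1}


# Toy logits (made up): every row favours class 0, as a collapsed model would.
toy_logits = [[2.0, -1.0], [1.5, 0.3], [0.9, 0.1], [3.0, -2.0]]
toy_labels = [0, 1, 1, 0]

report = diagnose_binary(toy_logits, toy_labels)
print(report)
```

With the code above, the same helper could be called as `diagnose_binary(x.predictions, x.label_ids)`. If only one class appears under "predicted", the model has collapsed to a constant prediction rather than the metric being miscomputed.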

johntheripper43 · May 29, 2022, 6:29pm

I found an error. I set evaluation_strategy="steps" and the problem was solved.
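
For readers who want to try the same change, here is a minimal sketch of what it might look like. The post does not show the final arguments, so `eval_steps=500` is an assumed value, and `training_kwargs` is a hypothetical name used only for illustration. In recent transformers releases the argument is named `eval_strategy` instead of `evaluation_strategy`.

```python
# Keyword arguments intended for transformers.TrainingArguments.
# Only evaluation_strategy="steps" comes from the post; eval_steps is assumed.
training_kwargs = {
    "output_dir": "/home/jupyter/work/resources/Trash",
    "evaluation_strategy": "steps",  # evaluate every eval_steps instead of once per epoch
    "eval_steps": 500,               # assumed value, not given in the thread
}

# Usage, assuming transformers is installed:
# from transformers import TrainingArguments
# training_args = TrainingArguments(**training_kwargs)
print(training_kwargs)
```

Evaluating every few hundred steps also shows at which point in training the F1 score drops, which a single per-epoch evaluation can hide.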

johannes-garstenauer · August 31, 2023, 11:08am

I am surprised this solved the issue. If you remember, would you mind explaining why that was an error, and how your solution solved the problem?
