I am trying to use Hugging Face's AutoModelForSequenceClassification API for multi-class classification but am confused about its configuration.
My dataset's labels are one-hot encoded, and the problem type is multi-class (one label at a time).
What I have tried:
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=6,
    id2label=id2label,
    label2id=label2id,
)
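From reading the docs, the config also accepts a problem_type argument, which seems to drive which loss the model uses; if I understand correctly, I could pass it the same way as num_labels. A sketch (using BertConfig just to show the field without a download; the same keyword should work with from_pretrained):

```python
from transformers import BertConfig

# problem_type selects the loss inside the model's forward pass:
#   "single_label_classification" -> nn.CrossEntropyLoss
#   "multi_label_classification"  -> nn.BCEWithLogitsLoss
#   "regression"                  -> nn.MSELoss
config = BertConfig(
    num_labels=6,
    problem_type="single_label_classification",
)
```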
batch_size = 8
metric_name = "f1"

from transformers import TrainingArguments, Trainer

args = TrainingArguments(
    "bert-finetuned-english",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=10,
    weight_decay=0.01,
    load_best_model_at_end=True,
    metric_for_best_model=metric_name,
    # push_to_hub=True,
)
trainer = Trainer(
    model,
    args,
    train_dataset=encoded_dataset["train"],
    eval_dataset=encoded_dataset["test"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
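My compute_metrics isn't shown above; this is the kind of definition I have in mind for single-label multi-class (a sketch assuming scikit-learn is available and the labels are integer class indices — with one-hot labels an extra labels.argmax(-1) would be needed):

```python
import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair supplied by the Trainer
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)  # single-label: pick the top class
    return {"f1": f1_score(labels, preds, average="weighted")}
```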
Is this correct?
I am also confused about the loss function: when I print the output of a single forward pass, the loss is BinaryCrossEntropyWithLogits:
SequenceClassifierOutput([
    ('loss', tensor(0.6986, grad_fn=<BinaryCrossEntropyWithLogitsBackward0>)),
    ('logits', tensor([[-0.5496,  0.0793, -0.5429, -0.1162, -0.0551]],
               grad_fn=<AddmmBackward0>))
])
which is used for multi-label or binary classification tasks. Shouldn't it be using nn.CrossEntropyLoss instead?
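From what I understand, when problem_type is left unset, Transformers infers it from the labels: integer class indices give single_label_classification (CrossEntropyLoss), while float vectors such as one-hot rows give multi_label_classification (BCEWithLogitsLoss), which would explain the output above. A plain-PyTorch sketch of the two losses on my logits (the target values here are made up for illustration):

```python
import torch
import torch.nn as nn

logits = torch.tensor([[-0.5496, 0.0793, -0.5429, -0.1162, -0.0551]])

# Integer class index -> what single_label_classification would use
loss_ce = nn.CrossEntropyLoss()(logits, torch.tensor([1]))

# Float one-hot target -> what multi_label_classification would use
loss_bce = nn.BCEWithLogitsLoss()(logits, torch.tensor([[0., 1., 0., 0., 0.]]))
```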
How do I properly use this API for multi-class classification, and how should the loss function be defined?