Trainer does not log validation loss and metrics

Hello, today I used Trainer to train a LoRA model, but there is no log of validation loss or metrics in the output of trainer.train(). The code is as follows:

import evaluate
import numpy as np
import torch
from datasets import load_from_disk
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)


tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base-v2")
dataset = load_from_disk("data")


def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, max_length=150)


dataset = dataset.map(tokenize, batched=True).remove_columns(["sentence"])
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
dataset.set_format(
    "torch", device=device, columns=["input_ids", "attention_mask", "labels"]
)
### input_ids must be the first column
dataset = dataset.map(lambda batch: {"new_labels": batch["labels"]}, batched=True)
dataset = dataset.remove_columns("labels")
dataset = dataset.rename_column("new_labels", "labels")

from peft import LoraConfig, TaskType, get_peft_model
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

peft_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    base_model_name_or_path="vinai/phobert-base-v2",
    r=8,
    lora_alpha=8,
    lora_dropout=0.1,
)
model = AutoModelForSequenceClassification.from_pretrained(
    "vinai/phobert-base-v2", num_labels=2
)
model = get_peft_model(model, peft_config)

args = TrainingArguments(
    output_dir="./checkpoints",
    overwrite_output_dir=True,
    evaluation_strategy="epoch",
    per_device_eval_batch_size=64,
    per_device_train_batch_size=64,
    gradient_accumulation_steps=4,
    optim="adamw_torch_fused",
    tf32=True,
    learning_rate=5e-5,
    weight_decay=0.01,
    num_train_epochs=10,
    logging_strategy="epoch",
    save_strategy="epoch",
    dataloader_num_workers=10,
    remove_unused_columns=False
)

args.set_dataloader(auto_find_batch_size=True)

trainer = Trainer(
    model=model,
    args=args,
    tokenizer=tokenizer,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    compute_metrics=compute_metrics,
)

trainer.train()

The output is:
[screenshot of trainer.train() output]

I've been dealing with this same issue, and it looks like you have to provide the name of the column that contains your labels by passing this as an argument (if I understand it correctly, Trainer infers the label column from the model's forward signature, and the PEFT wrapper hides it, so evaluation never sees any labels):

label_names = ["labels"],


Thanks for this reply, I was facing a similar issue.
I haven't tried it yet to see whether it solves my problem as well.
If you don't mind me asking a clarifying question: does the label_names argument depend on which model is used, or is it a generic argument regardless of the underlying model?

As a beginner with the HF library: on the one hand we are all grateful that it exists; on the other hand, IMHO it's a mess, because it didn't copy the good practices from other libraries like scikit-learn or PyTorch.

e.g. from transformers import X, where X can be anything under the sun, instead of a more structured approach: from transformers.models import X, from transformers.tokenizers import Y, from transformers.datasets import Z, etc.

Also, the whole AutoModelXYZ naming is utterly confusing; it would have been much clearer if we only had from transformers.models import Model, ModelConfig and then defined in the ModelConfig whatever task one is interested in.


This didn’t work in my case :cry: :sob:

The keyword goes in TrainingArguments, not in Trainer.
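
For reference, a minimal sketch of where it goes, reusing the names from the original post (note that evaluation_strategy may be called eval_strategy in newer transformers versions):

args = TrainingArguments(
    output_dir="./checkpoints",
    evaluation_strategy="epoch",
    logging_strategy="epoch",
    save_strategy="epoch",
    # explicitly tell Trainer which input column holds the labels
    label_names=["labels"],
)

trainer = Trainer(
    model=model,
    args=args,
    tokenizer=tokenizer,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    compute_metrics=compute_metrics,
)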