Getting No log in validation_loss

sai-santhosh · February 14, 2025, 3:01pm

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B",
    quantization_config = bnb_config,
    trust_remote_code = True
).to(device)

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B")

from peft import prepare_model_for_kbit_training, LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules= ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    bias="none",
)

model = prepare_model_for_kbit_training(model)

model = get_peft_model(model, lora_config)

from trl import setup_chat_format

model,tokenizer = setup_chat_format(model,tokenizer)

from trl import SFTConfig, SFTTrainer

args = SFTConfig(
    output_dir = "lora_model/",
    per_device_train_batch_size = 4,
    per_device_eval_batch_size = 4,
    learning_rate = 2e-05,
    gradient_accumulation_steps = 2,
    max_steps = 150,
    logging_strategy = "steps",
    logging_steps = 5,
    save_strategy = "steps",
    save_steps = 25,
    eval_strategy = "steps",
    eval_steps = 5,
    lr_scheduler_type = "cosine",
    fp16 = True,
    data_seed=42,
    max_seq_length = 2048,
    report_to = "none",
)

trainer = SFTTrainer(
    model = model,
    args = args,
    processing_class = tokenizer,
    train_dataset = dataset['train'],
    eval_dataset = dataset['test'])

why am i getting no log for validation loss

John6666 · February 14, 2025, 3:16pm

Perhaps this?

For LLM validation loss you might also need label_names=[] in the TrainingArguments and set trainer.can_return_loss = True to satisfy transformers/src/transformers/trainer.py at v4.40.2 · huggingface/transformers · GitHub for some models.

sai-santhosh · February 14, 2025, 3:37pm

im working on conversational data, so i wont be able to create label_names

John6666 · February 14, 2025, 4:10pm

Hmmm… I think I may have found a way to forcefully record it.

github.com/unslothai/unsloth

No Validation Loss logged (possibly related to train_on_responses_only?)

opened 07:04PM - 12 Sep 24 UTC

selalipop

fixed - pending confirmation

Evaluations are being run, _but no validation loss is logged or sent to WandB_ … The console shows that eval is running, but displays a table along the lines of: | eval loss | validation loss | |-----------|-----------------| | 1 | no log | | .9 | no log | | .8 | no log | WandB shows evidence validation run occurs, but doesn't display loss either: <img width="1351" alt="image" src="https://github.com/user-attachments/assets/b2934110-8e6d-4df2-aa4c-7fb1bdbf03e0"> ``` from trl import SFTTrainer from transformers import TrainingArguments, DataCollatorForSeq2Seq from unsloth import is_bfloat16_supported import os os.environ["WANDB_PROJECT"] = "my_project" # name your W&B project os.environ["WANDB_LOG_MODEL"] = "checkpoint" # log all model checkpoints trainer = SFTTrainer( model = model, tokenizer = tokenizer, train_dataset = train_dataset, eval_dataset=validation_dataset, dataset_text_field = "text", max_seq_length = max_seq_length, data_collator = DataCollatorForSeq2Seq(tokenizer = tokenizer), dataset_num_proc = 2, packing = False, # Can make training 5x faster for short sequences. args = TrainingArguments( fp16_full_eval = True, per_device_eval_batch_size = 2, eval_accumulation_steps = 4, report_to = "wandb", run_name = "run-name-here", per_device_train_batch_size = 4, gradient_accumulation_steps = 4, gradient_checkpointing=True, warmup_steps = 5, do_predict=True, logging_first_step=True, num_train_epochs = 3, # Set this for 1 full training run. save_steps = 88, evaluation_strategy="steps", eval_steps=88, do_eval=True, learning_rate = 1e-4, fp16 = not is_bfloat16_supported(), bf16 = is_bfloat16_supported(), logging_steps = 1, optim = "adamw_8bit", weight_decay = 0.01, lr_scheduler_type = "linear", seed = 3407, output_dir = "outputs", ), ) --- from unsloth.chat_templates import train_on_responses_only trainer = train_on_responses_only( trainer, instruction_part = "<|start_header_id|>user<|end_header_id|>\n\n", response_part = "<|start_header_id|>assistant<|end_header_id|>\n\n", ) trainer.train() ``` Very similar settings work when using plain SFTTrainer in another project

Topic		Replies	Views
Trainer() shows no log for validation loss when using PEFT 🤗Transformers	2	538	September 11, 2024
Trainer do not log validation loss and metrics Beginners	5	1633	September 1, 2024
SFTTrainer not reporting training loss Beginners	0	614	April 8, 2024
Loading pre-trained models with AddedTokens 🤗Transformers	2	749	October 14, 2024
Unable to train model (Loss is 0.000000) DeepSpeed	2	1093	October 17, 2023

Getting No log in validation_loss

Related topics