Difference between AutoModelForCausalLMWithValueHead and AutoModelForCausalLM

I know what AutoModelForCausalLM is. My question is this: the PEFT LoRA fine-tuning tutorial uses AutoModelForCausalLMWithValueHead, whereas almost any other code or notebook that fine-tunes an LLM with PEFT uses AutoModelForCausalLM.

I looked up AutoModelForCausalLMWithValueHead in the official documentation and found:

An autoregressive model with a value head in addition to the language model head
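To make sure I am reading that sentence correctly, here is a toy sketch of how I understand it. This is only my assumption, not the library's actual implementation: the class name ToyCausalLMWithValueHead, the helper linear, and all sizes are hypothetical, and the "hidden states" are hard-coded numbers standing in for the output of a real transformer. The idea is that the same per-token hidden states feed two heads: the usual language model head (one logit per vocabulary entry) and an extra value head (one scalar per token).

```python
import random


def linear(x, weight, bias):
    """Apply y = W x + b to a single vector x (weight is a list of rows)."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
        for row, b in zip(weight, bias)
    ]


class ToyCausalLMWithValueHead:
    """Hypothetical stand-in: two heads reading the same hidden states."""

    def __init__(self, hidden_size, vocab_size, seed=0):
        rng = random.Random(seed)
        # Language model head: hidden_size -> vocab_size logits per token.
        self.lm_w = [
            [rng.uniform(-1, 1) for _ in range(hidden_size)]
            for _ in range(vocab_size)
        ]
        self.lm_b = [0.0] * vocab_size
        # Value head: hidden_size -> a single scalar per token.
        self.v_w = [[rng.uniform(-1, 1) for _ in range(hidden_size)]]
        self.v_b = [0.0]

    def forward(self, hidden_states):
        # hidden_states: one vector per token (assumed to come from the
        # transformer body, which is omitted in this sketch).
        logits = [linear(h, self.lm_w, self.lm_b) for h in hidden_states]
        values = [linear(h, self.v_w, self.v_b)[0] for h in hidden_states]
        return logits, values


# Three tokens, hidden size 4 (made-up numbers for illustration only).
hidden_states = [
    [0.1, -0.2, 0.3, 0.4],
    [0.5, 0.0, -0.1, 0.2],
    [-0.3, 0.2, 0.1, 0.0],
]
model = ToyCausalLMWithValueHead(hidden_size=4, vocab_size=6)
logits, values = model.forward(hidden_states)
print(len(logits), len(logits[0]))  # tokens x vocabulary entries
print(len(values))                  # one scalar value per token
```

So, if my reading is right, AutoModelForCausalLM would return only the logits, while the WithValueHead variant additionally returns a scalar per token.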

What I want to ask is: how, where and, more importantly, WHY is this extra value head used?