I was reading the code for RAG (Retrieval-Augmented Generation) in the transformers GitHub repo.
I wanted to understand how gradients are backpropagated all the way to the query encoder, and I wrote up an answer for it.
But then I wondered: how is the loss for the retrieval model (the query encoder) calculated by simply taking a softmax over the doc_scores? Here
I get that they are adding the softmax over doc_scores to the seq_logits loss in order to backpropagate the gradients. But I am unable to understand the intuition behind taking a softmax over doc_scores, and why that probability distribution is added to the seq_logits loss. The two seem like different things.
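For reference, here is a minimal PyTorch sketch of the marginalization step as I understand it (hypothetical tensor shapes, not the exact library code): the softmax over doc_scores gives log p(z|x), which is added to the per-document token log-probs log p(y|x,z) and then marginalized over the retrieved docs with logsumexp, so the final NLL depends on doc_scores and gradients flow back into the query encoder.

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: 2 questions, 3 retrieved docs each, target length 4, vocab 10.
batch_size, n_docs, seq_len, vocab_size = 2, 3, 4, 10

# seq_logits: generator logits for each (question, doc) pair.
# doc_scores: retrieval scores (query embedding . doc embedding); in RAG these
# come from the query encoder, here just a leaf tensor for illustration.
seq_logits = torch.randn(batch_size * n_docs, seq_len, vocab_size, requires_grad=True)
doc_scores = torch.randn(batch_size, n_docs, requires_grad=True)

# log p(y_t | x, z) for each retrieved doc z
seq_logprobs = F.log_softmax(seq_logits, dim=-1).view(batch_size, n_docs, seq_len, vocab_size)

# log p(z | x): the softmax over doc_scores turns raw retrieval scores
# into a distribution over the retrieved docs
doc_logprobs = F.log_softmax(doc_scores, dim=1)

# log p(z | x) + log p(y_t | x, z), then marginalize over docs
log_prob_sum = seq_logprobs + doc_logprobs.unsqueeze(-1).unsqueeze(-1)
marginalized = torch.logsumexp(log_prob_sum, dim=1)  # (batch, seq_len, vocab)

# NLL of the target tokens now depends on doc_scores, so the loss
# backpropagates through doc_scores into the query encoder.
target = torch.randint(0, vocab_size, (batch_size, seq_len))
loss = F.nll_loss(marginalized.permute(0, 2, 1), target)
loss.backward()
print(doc_scores.grad is not None)  # True
```

So the doc_scores softmax is not a separate loss that gets added to the seq_logits loss; it is the doc-posterior term in the marginalized likelihood, which is what I'm trying to confirm.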