The Hugging Face RAG implementation now includes a script that lets us use a custom dataset instead of the wiki dataset.
Since the fine-tuning phase of RAG does not update the doc encoder (only BART and the question encoder are updated), what happens if our custom dataset comes from a different distribution than the wiki dataset (e.g., medical records)?
Will it still work?
P.S. In the RAG paper, the authors just used the pretrained DPR and never updated the doc encoder weights during fine-tuning.
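
To make the setup I am asking about concrete, here is a toy sketch in plain Python. It is not the Hugging Face code: the linear `encode` function, the two-dimensional feature vectors, and all weights are made up for illustration. It only shows the asymmetry, where the document index is built once with a frozen encoder while the question encoder can change.

```python
# Toy sketch (NOT the Hugging Face implementation): the document index is
# built once with a frozen encoder; only the question encoder is updated.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def encode(weights, features):
    # Hypothetical linear "encoder": one weight row per output dimension.
    return [dot(row, features) for row in weights]

# Frozen document encoder: used once to build the index, never updated.
DOC_ENCODER = [[1.0, 0.0], [0.0, 1.0]]
docs = {"wiki_passage": [1.0, 0.0], "medical_record": [0.0, 1.0]}
index = {name: encode(DOC_ENCODER, feats) for name, feats in docs.items()}

def retrieve(question_encoder, question_features):
    # Return the document whose embedding has the highest inner product
    # with the encoded question.
    q = encode(question_encoder, question_features)
    return max(index, key=lambda name: dot(q, index[name]))

# Made-up question encoder weights before and after fine-tuning.
question_encoder_before = [[1.0, 0.0], [0.0, 0.2]]
question_encoder_after = [[0.2, 0.0], [0.0, 1.0]]

question = [1.0, 1.0]
index_snapshot = dict(index)  # copy, to check the index never changes
top_before = retrieve(question_encoder_before, question)
top_after = retrieve(question_encoder_after, question)
```

In this toy case the retrieved document changes only because the question embedding moved. The document embeddings in `index` stay exactly as the frozen encoder produced them, and the sketch says nothing about whether that is good enough for a real out-of-domain corpus.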