Hi friend, I believe the site will come back soon, since (referring to our last conversation) even https://huggingface.co/qa/ is now back after several days of downtime.
BTW, the RAG link does not point to a blog post; it is a RAG demo similar to the long-form QA demo we discussed. I still have screen captures of it, so I am posting them here as teasers while you wait.
Thanks @Jung for the screenshot, it looks great. Looking forward to the site being online again.
Maybe @yjernite could help?
Hi @cahya! Yes, the link is supposed to point to the demo in this case; we didn’t write a long blog post as we did for ELI5.
If you want more details, you can refer to the paper for questions about the model, or ask here if you have questions about the specifics of the demo.
Thanks @yjernite for bringing the demo online again. I will try it out and compare it with the ELI5 demo you created (are they comparable, btw?).
The ELI5 demo is supposed to answer questions that require a long explanation, while RAG focuses on factoids. You can try the example questions from one demo in the other to see the difference.
Hi @yjernite, could you please restart the demo again? It seems to have a memory issue. Thanks.
Btw, as I understand it, the Bart model in ELI5 is trained with the ELI5 question plus some retrieved passages as input, and the ELI5 answer as output. How, then, is the Bart model in RAG trained?
Bart-in-RAG and Yacine’s Bart-in-ELI5 are very similar in design.
In the overall pipeline, both RAG and Bart-ELI5 use the question as a query to retrieve passages (5 by default), then concatenate question and passage together as input for the Bart encoder.
In the paper and in Hugging Face’s pretrained weights, RAG is trained on the NQ dataset, so it does short-form QA, unlike ELI5.
Note that in RAG, each single NQ example is expanded into 5 examples:
Consider one training example from the original NQ: Q / A (a question-answer pair).
In RAG, this training example becomes: Q-P1 / A, Q-P2 / A, Q-P3 / A, Q-P4 / A, Q-P5 / A
where P1-P5 are the retrieved passages. We then train Bart on these new pairs.
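To make the expansion above concrete, here is a minimal sketch in plain Python. The function name `expand_example`, the separator string, and the placeholder passages are all made up for illustration; they are not taken from the RAG code base, which does its own formatting internally.

```python
def expand_example(question, answer, passages, sep=" // "):
    """Turn one (Q, A) pair into one (Q-Pi, A) pair per retrieved passage.

    `sep` is a hypothetical separator chosen for this sketch only.
    """
    return [(f"{question}{sep}{passage}", answer) for passage in passages]


# Placeholder passages standing in for the 5 retrieved ones (P1-P5).
retrieved = ["P1", "P2", "P3", "P4", "P5"]
pairs = expand_example("Q", "A", retrieved)

for source, target in pairs:
    print(source, "->", target)
```

Every expanded pair keeps the same target A; only the passage attached to the question changes.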
Thanks @Jung for the explanation. And how about inference? If we retrieve 5 passages, should we concatenate the question and all 5 passages into one input, as in ELI5? That would be different from training.
Hi Cahya, at inference/generation time, RAG averages over the 5 expanded inputs.
RAG proposes two methods for averaging the results (each is an approximation of Bayesian marginalization), called the ‘token’ and ‘sequence’ methods, respectively.
That’s why there are two Hugging Face RAG generation models, RagTokenForGeneration and RagSequenceForGeneration (the averaging is already built in, so we don’t have to do anything).
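As a rough illustration of the difference between the two methods, here is a toy sketch in plain Python. It assumes we already have a retriever probability for each passage and, for each passage, the generator's probability of every token in a fixed target answer. The names `doc_probs` and `token_probs` and the numbers are invented for this example; the real models do this on tensors inside the library.

```python
import math


def rag_sequence_prob(doc_probs, token_probs):
    """'sequence': score the whole answer per passage, then sum over passages."""
    return sum(
        p_doc * math.prod(per_token)
        for p_doc, per_token in zip(doc_probs, token_probs)
    )


def rag_token_prob(doc_probs, token_probs):
    """'token': sum over passages at every token, then multiply over tokens."""
    n_tokens = len(token_probs[0])
    result = 1.0
    for t in range(n_tokens):
        result *= sum(
            p_doc * per_token[t]
            for p_doc, per_token in zip(doc_probs, token_probs)
        )
    return result


# Toy numbers: 2 passages, a 2-token answer.
doc_probs = [0.5, 0.5]        # retriever weight of each passage
token_probs = [
    [1.0, 0.0],               # passage 1 supports only the first token
    [0.0, 1.0],               # passage 2 supports only the second token
]

print(rag_sequence_prob(doc_probs, token_probs))
print(rag_token_prob(doc_probs, token_probs))
```

In this toy case the ‘token’ method can combine evidence from different passages within one answer, while the ‘sequence’ method needs a single passage to support the whole answer.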
Thanks again @Jung for the great explanation, I got it now