Dear authors of the RAG model,
I know I can fine-tune RAG on a single example, as in the following code:
import torch
from transformers import RagRetriever, RagSequenceForGeneration, RagTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# rag_example_args, passages_path, index_path, and cache_dir are set up as in the custom-knowledge-dataset example
retriever = RagRetriever.from_pretrained(rag_example_args.rag_model_name, index_name="custom", passages_path=passages_path, index_path=index_path)
model = RagSequenceForGeneration.from_pretrained(rag_example_args.rag_model_name, retriever=retriever, cache_dir=cache_dir).to(device)
tokenizer = RagTokenizer.from_pretrained(rag_example_args.rag_model_name, cache_dir=cache_dir)

inputs = tokenizer("How many people live in Paris?", return_tensors="pt")
with tokenizer.as_target_tokenizer():
    targets = tokenizer("In Paris, there are 10 million people.", return_tensors="pt")

input_ids = inputs["input_ids"].to(device)
labels = targets["input_ids"].to(device)
outputs = model(input_ids=input_ids, labels=labels)
However, this is for a single sentence.
How can I fine-tune with a mini-batch of QA samples?
Could you give an example?
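For reference, here is a rough, untested sketch of what I imagine batched fine-tuning might look like. I am assuming the tokenizer accepts a list of strings with padding=True and that the padded labels can be passed to the model as-is; please correct me if that is wrong:

questions = ["How many people live in Paris?", "Where is the Eiffel Tower located?"]
answers = ["In Paris, there are 10 million people.", "The Eiffel Tower is located in Paris."]

# tokenize the whole mini-batch at once, padding to the longest sequence
inputs = tokenizer(questions, return_tensors="pt", padding=True, truncation=True)
with tokenizer.as_target_tokenizer():
    targets = tokenizer(answers, return_tensors="pt", padding=True, truncation=True)

input_ids = inputs["input_ids"].to(device)
attention_mask = inputs["attention_mask"].to(device)
labels = targets["input_ids"].to(device)

# forward pass over the batch; not sure whether padding tokens in the labels need special handling
outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
loss = outputs.loss
loss.backward()

Is something like this the intended approach, or should I be using a DataLoader with a collate function instead?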
Thank you very much!
@patrickvonplaten @lhoestq