I have a RAG model and retriever based on facebook/rag-sequence-nq, and I have a couple of questions about how to retrieve documents. I built the model by following the transformers/examples/research_projects/rag/use_own_knowledge_dataset.py example in the huggingface/transformers repository, but I am at a loss as to how to get the documents that were used to generate the answer it gives me. The retriever expects a question_hidden_states argument, and I can't figure out what exactly it is or how to produce it. Any advice and help would be much appreciated.
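from transformers import RagRetriever, RagSequenceForGeneration, RagTokenizer

# `dataset` is the indexed knowledge dataset built with use_own_knowledge_dataset.py
retriever = RagRetriever.from_pretrained("facebook/rag-sequence-nq", index_name="custom", indexed_dataset=dataset)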
model = RagSequenceForGeneration.from_pretrained("facebook/rag-sequence-nq", retriever=retriever)
model.config.output_hidden_states = True
tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
question2 = "What are the best antidepressants for depression?"
tokenization = tokenizer.question_encoder(question2, return_tensors="pt")
generated = model.generate(tokenization.input_ids, output_hidden_states=True)
generated_string = tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
print("Q: " + question2) # Q: What are the best antidepressants for depression?
print("A: " + generated_string) # A: selective serotonin reuptake inhibitors
docs = retriever.retrieve(tokenization, n_docs=5)  # Fails: retrieve() expects question_hidden_states, not the tokenizer output
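
To make the question more concrete, here is a minimal sketch of what question_hidden_states appears to stand for: one dense vector per question, which the retriever scores against the passage embeddings by inner product and uses to pick the top n_docs passages. This is a toy stand-in in plain Python, not the library's implementation. The names toy_retrieve and PASSAGES, the passage texts, and the 3-dimensional vectors are all made up for illustration. In the real setup the vector would presumably come from the question encoder, along the lines of model.question_encoder(tokenization.input_ids)[0], converted to a NumPy array before being handed to retriever.retrieve.

```python
from typing import Dict, List, Tuple

# Hypothetical mini "index": each passage carries a title, a text, and a dense
# embedding. In the real setup these come from the indexed knowledge dataset.
PASSAGES: List[Dict] = [
    {"title": "SSRIs", "text": "Passage about SSRIs.", "embedding": [0.9, 0.1, 0.0]},
    {"title": "SNRIs", "text": "Passage about SNRIs.", "embedding": [0.7, 0.3, 0.1]},
    {"title": "Gardening", "text": "Passage about gardening.", "embedding": [0.0, 0.1, 0.9]},
]


def inner_product(a: List[float], b: List[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def toy_retrieve(
    question_hidden_states: List[List[float]], n_docs: int
) -> Tuple[List[List[int]], List[List[Dict]]]:
    """For each question vector, return the ids and passages of the
    n_docs passages with the highest inner-product score."""
    all_ids: List[List[int]] = []
    all_docs: List[List[Dict]] = []
    for question_vector in question_hidden_states:
        scores = [inner_product(question_vector, p["embedding"]) for p in PASSAGES]
        # Indices of passages sorted by descending score, truncated to n_docs.
        top = sorted(range(len(PASSAGES)), key=lambda i: scores[i], reverse=True)[:n_docs]
        all_ids.append(top)
        all_docs.append([PASSAGES[i] for i in top])
    return all_ids, all_docs


# One question gives a batch of size 1: a list holding a single vector.
question_hidden_states = [[1.0, 0.2, 0.0]]
doc_ids, docs = toy_retrieve(question_hidden_states, n_docs=2)
print(doc_ids)                         # [[0, 1]]
print([d["title"] for d in docs[0]])   # ['SSRIs', 'SNRIs']
```

The point of the sketch is only the shape of the input: the retriever works on encoded question vectors (one row per question), not on token ids or the tokenizer's output object.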