How do I use the RagRetriever to retrieve documents? (What is the question_hidden_states variable and how do I make it?)

pambalos · March 18, 2024, 4:40am

I have a RAG model and retriever using the facebook/rag-sequence-nq model, and I have a couple of questions about how to retrieve documents. I built the model following the guide at transformers/examples/research_projects/rag/use_own_knowledge_dataset.py (main branch of huggingface/transformers on GitHub), but I am at a loss as to how to get the documents used to generate the answer it gives me. retriever.retrieve expects a question_hidden_states variable; I can't figure out what exactly it is or how to make it. Any advice and help would be much appreciated.

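from transformers import RagRetriever, RagSequenceForGeneration, RagTokenizer

# `dataset` is the indexed knowledge dataset built by following the guide above
retriever = RagRetriever.from_pretrained("facebook/rag-sequence-nq", index_name="custom", indexed_dataset=dataset)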
model = RagSequenceForGeneration.from_pretrained("facebook/rag-sequence-nq", retriever=retriever)
model.config.output_hidden_states = True
tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
question2 = "What are the best antidepressants for depression?"
tokenization = tokenizer.question_encoder(question2, return_tensors="pt")
generated = model.generate(tokenization.input_ids, output_hidden_states=True)
generated_string = tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

print("Q: " + question2) # Q: What are the best antidepressants for depression?
print("A: " + generated_string) # A:  selective serotonin reuptake inhibitors

docs = retriever.retrieve(tokenization, n_docs=5) # Fails because it expects a question_hidden_states?
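For reference, here is a minimal sketch of one way to build question_hidden_states and pass it to the retriever. As far as I can tell, it is the question embedding produced by the model's question encoder, converted to a NumPy array, rather than the tokenizer output. The helper name retrieve_documents is hypothetical (it is not part of transformers), and the sketch assumes the indexed dataset has "title" and "text" columns, as in the guide.

```python
import numpy as np


def retrieve_documents(model, retriever, tokenizer, question, n_docs=5):
    """Hypothetical helper: return the passages the retriever finds for `question`."""
    # Tokenize with the question-encoder tokenizer, as in the post above.
    inputs = tokenizer.question_encoder(question, return_tensors="pt")

    # The first output of the question encoder is the question embedding,
    # shape (batch_size, hidden_dim). This is `question_hidden_states`.
    question_hidden_states = model.question_encoder(inputs["input_ids"])[0]

    # retrieve() expects a NumPy array, not a tensor or a tokenizer output.
    query_vectors = np.asarray(
        question_hidden_states.detach().numpy(), dtype=np.float32
    )

    # Returns (retrieved_doc_embeds, doc_ids, doc_dicts).
    _, doc_ids, doc_dicts = retriever.retrieve(query_vectors, n_docs=n_docs)

    # doc_dicts holds one dict per question, mapping column names to lists.
    # Assumption: the dataset has "title" and "text" columns.
    docs = doc_dicts[0]
    return [
        {"id": int(doc_id), "title": title, "text": text}
        for doc_id, title, text in zip(doc_ids[0], docs["title"], docs["text"])
    ]
```

With the objects from the post, the call would look like docs = retrieve_documents(model, retriever, tokenizer, question2, n_docs=5). These are the passages the retriever returns for the question; whether they match exactly what generate() used internally depends on the n_docs setting in the model config.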

pambalos · March 18, 2024, 10:20pm

Found an example RAG
