RAG Retriever: hf vs legacy vs exact vs compressed

We saw your post saying the exact index has to be used to replicate the paper (RAG Retriever: Exact vs. Compressed Index?).
However, the HuggingFace documentation says that the legacy index replicates the paper's results (RAG — transformers 4.11.2 documentation).

The compressed index has lower retrieval performance than the exact one, so you need either the exact or the legacy index to replicate RAG's performance.
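A minimal sketch of selecting the index when loading the pretrained retriever. The `index_name` values and `use_dummy_dataset` flag come from the transformers `RagRetriever` API; the helper function and its validation are illustrative, not part of the library:

```python
# Valid index_name values per the transformers RAG documentation.
VALID_INDEX_NAMES = {"exact", "compressed", "legacy"}


def load_rag_retriever(index_name: str = "exact", dummy: bool = True):
    """Load the pretrained NQ RAG retriever with the chosen index.

    index_name="exact" or "legacy" reproduces the paper's numbers;
    "compressed" trades retrieval quality for a smaller memory footprint.
    """
    if index_name not in VALID_INDEX_NAMES:
        raise ValueError(f"index_name must be one of {VALID_INDEX_NAMES}")
    # Heavy import kept local so validation stays cheap.
    from transformers import RagRetriever

    return RagRetriever.from_pretrained(
        "facebook/rag-token-nq",
        index_name=index_name,
        # use_dummy_dataset=True pulls a tiny dummy dataset, so you can
        # smoke-test the pipeline without the full wiki dump.
        use_dummy_dataset=dummy,
    )
```

With `dummy=False`, the "exact" and "compressed" configurations trigger the full wiki_dpr download discussed below.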

While loading the pretrained RAG retriever with the legacy index, we hit "MemoryError: std::bad_alloc".

The legacy index is 35GB and has to fit entirely in RAM, so make sure your machine has enough memory before loading it.
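You can sanity-check available memory before attempting the load. This sketch uses a Linux-only `sysconf` lookup; the 35GB requirement is the figure quoted above:

```python
import os

# The legacy FAISS index must fit in RAM (figure from this thread).
LEGACY_INDEX_GB = 35


def total_ram_gb() -> float:
    """Total physical memory in GB (Linux-only, via sysconf)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3


def check_ram(required_gb: float = LEGACY_INDEX_GB) -> float:
    """Raise early with a clear message instead of a std::bad_alloc later."""
    ram = total_ram_gb()
    if ram < required_gb:
        raise MemoryError(
            f"only {ram:.1f}GB RAM available, need ~{required_gb}GB "
            "to hold the legacy index in memory"
        )
    return ram
```

Failing fast here gives a readable Python error instead of the opaque `std::bad_alloc` raised from inside FAISS.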

Is there any text-based wiki dump that lets us replicate the paper's results while being smaller than 140GB?

You can use the wiki dump from the legacy index.
It takes less disk space because its embeddings are stored quantized inside the FAISS index, whereas the other indexes store the plain embeddings: those take around 70GB to download and another 70GB to convert to an Arrow dataset file. With legacy you end up with around 10GB of text passages plus the 35GB FAISS index.
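The disk arithmetic above can be summarized in a few lines; the figures are the approximate ones quoted in this thread, not exact file sizes:

```python
# Rough disk footprints (GB), per the numbers discussed in this thread.
footprint_gb = {
    # exact/compressed path: plain-embedding download + Arrow conversion
    "exact": 70 + 70,
    # legacy path: text passages (~10GB) + quantized FAISS index (~35GB)
    "legacy": 10 + 35,
}

savings = footprint_gb["exact"] - footprint_gb["legacy"]
print(f"legacy saves roughly {savings}GB of disk")  # roughly 95GB
```

The trade-off: legacy saves disk by quantizing embeddings into the index, but that 35GB index still has to sit in RAM, as noted above.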