Excited to share Cathedral-BEIR, a straightforward dense retrieval approach that achieves state-of-the-art results on the BEIR benchmark (0.5881 average nDCG@10 across the three datasets below) using just 768-dim L2-normalized embeddings from Nomic Embed v1.5 and cosine similarity.
No reranking, no sparse components, no extras: simply prefix queries with “search_query:” for better alignment, encode with the model, and retrieve via dot product.
Tested on SciFact (0.7036), NFCorpus (0.3381), and TREC-COVID (0.7226), it outperforms hybrid dense + BM25 baselines (~0.52 average).
Built on BEIR and Sentence Transformers; MIT licensed. Check it out and run the benchmarks yourself!
Repo: https://github.com/Ruffian-L/cathedral-beir
Model: https://huggingface.co/nomic-ai/nomic-embed-text-v1.5
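The retrieval step described above reduces to L2 normalization plus a dot product. A minimal numpy sketch of that scoring step, with random toy vectors standing in for the Nomic Embed outputs (in the real pipeline each query string is prefixed with "search_query:" and encoded with the model first):

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    # Divide each row by its Euclidean norm so dot product == cosine similarity.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def retrieve(query_emb: np.ndarray, doc_embs: np.ndarray, k: int = 10) -> np.ndarray:
    # Cosine scores via one matrix-vector product over normalized vectors,
    # then take the indices of the k highest-scoring passages.
    scores = doc_embs @ query_emb
    return np.argsort(-scores)[:k]

rng = np.random.default_rng(0)
docs = l2_normalize(rng.normal(size=(1000, 768)))  # stand-in for encoded passages
query = docs[42] + 0.01 * rng.normal(size=768)     # a query very close to passage 42
query = query / np.linalg.norm(query)

top = retrieve(query, docs, k=10)
print(top[0])  # passage 42 ranks first
```

Since the vectors are unit-norm, the dot product here is exactly the cosine similarity the post describes; no separate reranking stage is involved.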
Update: Quora (523K passages, 10K queries) just dropped
→ 0.8818 nDCG@10 pure dense
→ 95.26% Recall@10, 99.45% Recall@100
Current 5-dataset average: 0.6279 (SciFact, NFCorpus, TREC-COVID, ArguAna, Quora) — already more than 10 points above the ~0.52 hybrid ceiling everyone accepted as final.
Still zero reranker, zero BM25, zero distillation. Just Nomic Embed v1.5 + one prefix + proper normalization.
I’ve just finished a clean, end-to-end evaluation of nomic-embed-text-v1.5 (Matryoshka 512-dim cut) on the full HotpotQA dev set using a pure dense retrieval setup—no reranker, no BM25 fusion, no multi-vector, no training or fine-tuning of any kind.
Setup (exactly the same recipe that gave the BEIR SOTA in this thread):
- Model: nomic-ai/nomic-embed-text-v1.5 (512-dim MRL slice)
- Query prefix: “search_query:”
- Document prefix: “search_document:”
- L2-normalized embeddings + dot-product similarity
- FAISS FlatIP index (CPU fallback in this run)
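The 512-dim MRL slice in the setup above is just a truncation of the 768-dim output followed by re-normalization, so inner products over the sliced vectors remain cosine similarities. A numpy sketch of that step with toy vectors (in the actual run, the sliced vectors are added to a FAISS `IndexFlatIP`, which computes exact inner products):

```python
import numpy as np

def matryoshka_slice(emb: np.ndarray, dim: int = 512) -> np.ndarray:
    # Keep only the first `dim` coordinates of each embedding, then
    # re-L2-normalize so dot product is again cosine similarity.
    sliced = emb[:, :dim]
    return sliced / np.linalg.norm(sliced, axis=1, keepdims=True)

rng = np.random.default_rng(1)
full = rng.normal(size=(4, 768))                       # stand-in for 768-dim outputs
full /= np.linalg.norm(full, axis=1, keepdims=True)

small = matryoshka_slice(full, dim=512)
print(small.shape)  # (4, 512)
```

Matryoshka-trained models front-load information into the early dimensions, which is why a plain prefix truncation like this loses little retrieval quality.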
Results on HotpotQA dev (7,405 questions; 5,233,329 passages):

| Metric | Score |
|---|---|
| nDCG@1 | 0.7805 |
| nDCG@3 | 0.6703 |
| nDCG@5 | 0.6951 |
| nDCG@10 | 0.7151 |
| nDCG@100 | 0.7453 |
| Recall@10 | 0.7493 |
| Recall@100 | 0.8674 |
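For reference, the nDCG@k figures reported here can be computed from the ranked relevance labels of each query. A minimal sketch, assuming binary relevance (1 = gold supporting passage, 0 = non-relevant), averaged over queries in the full evaluation:

```python
import math

def dcg(rels):
    # Discounted cumulative gain with a log2 position discount (1-based ranks).
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels))

def ndcg_at_k(ranked_rels, k):
    # Normalize the ranking's DCG by the DCG of the ideal (sorted) ranking.
    ideal = sorted(ranked_rels, reverse=True)
    denom = dcg(ideal[:k])
    return dcg(ranked_rels[:k]) / denom if denom > 0 else 0.0

# Toy example: two relevant passages retrieved at ranks 1 and 4.
print(round(ndcg_at_k([1, 0, 0, 1, 0], k=10), 4))  # → 0.8772
```

Recall@k is simpler still: the fraction of a query's relevant passages that appear anywhere in its top-k results.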
Throughput: ~841 passages/s for embedding; full corpus indexed in ~23 s; top-100 search over 7.4k queries in ~12.5 min on CPU.
These numbers further confirm that nomic-embed-text-v1.5 delivers exceptional out-of-the-box dense retrieval performance, even on a multi-hop benchmark like HotpotQA.
Looking forward to seeing more community runs—especially curious about other open 512–768-dim models on the same zero-shot protocol.
Thanks again to the Nomic team for releasing such a strong and fully open embedding model.