IterableDataset doesn’t support vector similarity search because it only gives you access to one example at a time. It seems that both Faiss and ElasticSearch support memory mapping, so we will probably add support for that to the Dataset class soon.
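
To give a rough idea of what memory mapping buys you, here is a minimal sketch using only the Python standard library. It is not how Faiss, ElasticSearch, or the datasets library implement it: the flat float32 file layout, the embedding size DIM, and the helper names write_vectors and search are all made up for this example. The point is just that a memory-mapped file gives random access to every stored vector without loading the whole file into RAM, which a one-example-at-a-time iterator cannot offer.

```python
import heapq
import mmap
import os
import struct
import tempfile

DIM = 4                          # hypothetical embedding size
ROW = struct.Struct(f"<{DIM}f")  # one little-endian float32 vector per row


def write_vectors(path, vectors):
    """Store vectors back to back in a flat binary file (made-up layout)."""
    with open(path, "wb") as f:
        for vec in vectors:
            f.write(ROW.pack(*vec))


def search(path, query, k=2):
    """Return the indices of the k rows with the largest dot product with query."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        n_rows = len(mm) // ROW.size
        # Rows are decoded straight from the mapping, one at a time;
        # the OS pages data in as needed instead of reading the whole file.
        scored = (
            (sum(a * b for a, b in zip(ROW.unpack_from(mm, i * ROW.size), query)), i)
            for i in range(n_rows)
        )
        return [i for _, i in heapq.nlargest(k, scored)]


# Toy usage with a temporary file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vectors.bin")
    write_vectors(path, [(1, 0, 0, 0), (0, 1, 0, 0), (0.9, 0.1, 0, 0), (0, 0, 1, 0)])
    print(search(path, (1, 0, 0, 0), k=2))
```

This is a brute-force scan, so it only illustrates the access pattern. A real index such as the on-disk ones described in the Faiss wiki below would avoid scoring every row.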
Some external resources that could help:
- Faiss - Indexes that do not fit in RAM · facebookresearch/faiss Wiki · GitHub
- ElasticSearch (mmapfs) - Store | Elasticsearch Guide [7.16] | Elastic