datasets.Dataset.get_nearest_examples() on GPU

Hello everyone,

Is it possible to put get_nearest_examples() computation on GPU after having added a FAISS index?

You can specify the GPU ID when you create the index. By default (device=None) it uses the CPU.

add_faiss_index(column, device=0)
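For context, here is a minimal sketch of how the pieces might fit together end to end. The toy data, the column name "embeddings" and the helper name build_and_query are made up for illustration; only add_faiss_index and get_nearest_examples come from the datasets library. It assumes numpy, datasets and a faiss build are installed.

```python
from typing import Optional


def build_and_query(device: Optional[int] = None, k: int = 3):
    """Index a toy 'embeddings' column and return the k nearest rows to a query.

    Hypothetical helper for illustration, not part of the datasets API.
    """
    # Imported lazily so this sketch can be loaded without the optional dependencies.
    import numpy as np
    from datasets import Dataset

    # Toy data: 100 random 8-dimensional float32 vectors (illustrative only).
    rng = np.random.default_rng(0)
    vectors = rng.random((100, 8), dtype=np.float32)
    ds = Dataset.from_dict({"id": list(range(100)), "embeddings": vectors.tolist()})

    # device=None keeps the index on the CPU; an integer such as 0 selects that GPU
    # and requires a GPU-enabled faiss build.
    ds.add_faiss_index(column="embeddings", device=device)

    # Search the index, which is named after the column by default.
    query = vectors[0]
    scores, examples = ds.get_nearest_examples("embeddings", query, k=k)
    return scores, examples
```

Calling build_and_query(device=0) would place the index on the first GPU, while build_and_query() stays on the CPU.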

Thank you, Bram. I tried a similar solution when loading my faiss_index.

However, it throws an error: AttributeError: module 'faiss' has no attribute 'StandardGpuResources'

I have opened an issue in the facebookresearch/faiss repository: "AttributeError: module 'faiss' has no attribute 'StandardGpuResources' on adding FAISS index to Hugging Face Dataset".

You probably installed a prebuilt faiss-cpu rather than faiss-gpu. Remove that one and install faiss-gpu instead.
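A quick way to check which build is installed is to look for the attribute named in the error message. This is only a rough sketch: the function name faiss_build is made up, and it assumes that the presence of StandardGpuResources is a reasonable indicator of a GPU build.

```python
import importlib.util


def faiss_build() -> str:
    """Return 'missing', 'cpu' or 'gpu' depending on the installed faiss package."""
    # find_spec avoids an ImportError when faiss is not installed at all.
    if importlib.util.find_spec("faiss") is None:
        return "missing"
    import faiss

    # CPU-only builds do not expose StandardGpuResources, which is what the
    # AttributeError above is complaining about.
    return "gpu" if hasattr(faiss, "StandardGpuResources") else "cpu"


print(faiss_build())
```

If this prints cpu, passing device=0 to add_faiss_index will most likely fail with the same AttributeError.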

Thank you Bram. I have some updates.

FAISS installation (1)

  1. With faiss-cpu removed and only faiss-gpu installed, using:
    conda install faiss-gpu cudatoolkit=10.2 -c pytorch

According to (1), it should be cudatoolkit=10.0, but that forces package downgrades that could cause problems later.

This throws:
AssertionError: You must install Faiss to use FaissIndex. To do so you can run pip install faiss-cpu or pip install faiss-gpu

  2. With faiss-cpu removed and only faiss-gpu installed, using:
    pip install faiss-gpu

Problem solved.

Cheers

I don’t use conda so I’m not sure I can help with that - but if I understand you correctly, the problem has been solved by using pip install?

Yes, thank you.