Chat with a large set of documents - best approach?

Hi all,

I was wondering what the best approach is for retrieving specific information, based on a prompt, from a large set of documents (500 GB). Are embeddings still sufficient for that volume of data?
What would you suggest?

Thank you