Which solution is best suited in my case?

Hello Team,

I’m currently looking to create an API for a model that can answer questions based on proprietary materials, like my own notes or books. I want the model to be able to provide answers directly from these sources. Could you suggest which machine learning techniques would be best suited for this? Additionally, what hosting platforms should I use for both the API and dataset storage? It would also be helpful to understand the associated costs for these options.


This is not an easy use case if you want to run it completely locally. For example, would something like the following Space work for you?
That Space appears to use an 8B LLM. Without quantization that works out to a little over 20 GB of VRAM; with 4-bit quantization, roughly 6 GB should be enough.
The Space itself runs without any VRAM of its own because the HF server handles the inference. However, it is an online service, and if you use it too heavily you get rate-limit errors for about an hour. A paid tier lifts the limit, but it is pay-as-you-go.
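As a rough sanity check on those numbers, here is a minimal sketch of the VRAM arithmetic. The `vram_estimate_gb` helper and the 25% overhead factor (for KV cache and activations) are assumptions for illustration, not figures from any library:

```python
def vram_estimate_gb(num_params_billion: float, bits_per_param: int,
                     overhead: float = 1.25) -> float:
    """Rough VRAM estimate: weight bytes plus ~25% overhead (assumed)."""
    weight_bytes = num_params_billion * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 1e9

# 8B model, fp16 weights vs. 4-bit quantized weights
print(f"fp16:  {vram_estimate_gb(8, 16):.1f} GB")  # ~20 GB
print(f"4-bit: {vram_estimate_gb(8, 4):.1f} GB")   # ~5 GB
```

This is why an 8B model that needs a 20 GB-class GPU unquantized can fit on a consumer card once quantized to 4 bits.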

Anyway, once you find a suitable base Space, it will be easy to build what you need on top of it.

This topic was automatically closed 12 hours after the last reply. New replies are no longer allowed.