I see how you can upload the model itself and create an endpoint to it, but how does it work for corpus embedding vectors (from model.encode)? How do I create an endpoint or inference so my app can access it?
Hello there! You can upload your pre-computed embeddings as a Dataset on the Hub: Getting Started With Embeddings
This makes the embeddings available via the Datasets library, but it doesn’t give you an endpoint for access via an API. Without knowing more about your use case, though, it’s difficult to give better suggestions. Could you share more details about what you’re building?
Awesome, thanks for pointing that out. I'll use it. Is there any limit on storage size?
For the API, I think I'll build a REST service with Flask and host it on AWS.
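A rough sketch of what such a Flask endpoint could look like, assuming the query arrives already encoded as a vector (the route name, corpus, and embedding values here are made up; the embeddings would really be loaded from the Hub dataset):

```python
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)

# Placeholder corpus embeddings; in practice, load them from the Hub dataset.
corpus = ["first document", "second document", "third document"]
corpus_emb = np.random.rand(len(corpus), 384).astype("float32")
corpus_emb /= np.linalg.norm(corpus_emb, axis=1, keepdims=True)

@app.route("/search", methods=["POST"])
def search():
    # Expects a JSON body like {"embedding": [...]}, i.e. the query vector
    # encoded client-side (or by a model loaded alongside this app).
    q = np.asarray(request.get_json()["embedding"], dtype="float32")
    q /= np.linalg.norm(q)
    scores = corpus_emb @ q  # cosine similarity against each corpus vector
    best = int(np.argmax(scores))
    return jsonify({"text": corpus[best], "score": float(scores[best])})

# To serve locally:
# app.run(host="0.0.0.0", port=5000)
```

For a larger corpus you would likely swap the brute-force dot product for a FAISS index, but the endpoint shape stays the same.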
My other questions about creating the dataset are:
- How do you store embeddings as a single column within a dataframe?
- How do you extract the embeddings out from that single column when you retrieve the dataset?
Sorry, I missed your post!
> Any limit on storage size?
We don’t really have a limit! You might want to shard your files though, more on that here: Is there a size limit for dataset hosting - #4 by julien-c
For the other questions:
- As far as I’m aware, you can just set `df["embeddings"] = embeddings` or something like that, and it should be fine
- If you’ve stored each embedding like above, when you retrieve the particular subset of the dataframe for your operation you can just call `df["embeddings"]` to get the column back
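Both steps above can be sketched as follows, with random placeholder vectors standing in for `model.encode` output:

```python
import numpy as np
import pandas as pd

texts = ["first document", "second document"]
embeddings = np.random.rand(2, 384).astype("float32")  # stand-in for model.encode(texts)

# Store: one list/array-valued cell per row (a 2D array can't be
# assigned to a single column directly, so wrap it in a list).
df = pd.DataFrame({"text": texts})
df["embeddings"] = list(embeddings)

# Extract: stack the column back into an (n_rows, dim) matrix.
matrix = np.vstack(df["embeddings"].to_numpy())
```

The `np.vstack` step recovers a proper 2D array, which is what you'd feed into a similarity search.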
Hope I understood your question – let me know if this didn’t help!