Models for multilingual embeddings (similarity search)?

Good day,

I have a use case for similarity-based text search in a non-English language (Vietnamese in particular).

I'm hoping to get some pointers on how to use a Hugging Face model to generate embeddings for a vector DB. I’ve found one that looks promising: VoVanPhuc/sup-SimCSE-VietNamese-phobert-base. It’s a Transformers model tagged for ‘sentence similarity’. This is what I found from searching:

# Generate the embeddings using the model
with torch.no_grad():
    model_output = model(**encoded_input)
    embeddings = model_output.last_hidden_state.mean(dim=1)
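
For context, here is a fuller sketch of how I think the whole pipeline would fit together; I have not validated it on real data. The helper names (`mean_pool`, `embed`, `rank_by_similarity`, `demo`) and the sample sentences are my own placeholders, not something from the model card. As far as I can tell, a plain `.mean(dim=1)` would also average in padding tokens when sentences are batched, so the sketch masks them out using the attention mask. I'm also assuming 256 is a safe `max_length` for a PhoBERT-base model.

```python
import torch
import torch.nn.functional as F

MODEL_NAME = "VoVanPhuc/sup-SimCSE-VietNamese-phobert-base"


def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors per sentence, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)  # guard against division by zero
    return summed / counts


def embed(sentences, tokenizer, model):
    """Return one L2-normalized embedding per input sentence."""
    encoded_input = tokenizer(
        sentences,
        padding=True,
        truncation=True,
        max_length=256,  # assumption: token limit for PhoBERT-base
        return_tensors="pt",
    )
    with torch.no_grad():
        model_output = model(**encoded_input)
    pooled = mean_pool(model_output.last_hidden_state, encoded_input["attention_mask"])
    return F.normalize(pooled, p=2, dim=1)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar.

    With L2-normalized vectors, the dot product equals cosine similarity.
    """
    scores = doc_vecs @ query_vec
    return torch.argsort(scores, descending=True).tolist()


def demo():
    """Hypothetical end-to-end run; downloads the model weights on first use."""
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME)
    model.eval()

    # Placeholder sentences. PhoBERT-based models generally expect
    # word-segmented Vietnamese input, which is skipped here.
    docs = ["Hà Nội là thủ đô của Việt Nam.", "Tôi thích ăn phở."]
    doc_vecs = embed(docs, tokenizer, model)
    query_vec = embed(["Thủ đô của Việt Nam là gì?"], tokenizer, model)[0]
    print(rank_by_similarity(query_vec, doc_vecs))
```

Calling `demo()` should print the document indices in ranked order. In a real setup I assume the rows of `doc_vecs` would be what gets stored in the vector DB, and the DB would do the ranking instead of `rank_by_similarity`.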

Also, not sure if I’m looking at the right task (new to Hugging Face) :laughing:

Much appreciated!