Google has released LaBSE v2
https://tfhub.dev/google/LaBSE/2
Their example code uses a multilingual Keras preprocessor (their code links to “universal-sentence-encoder-cmlm/multilingual-preprocess/2”).
I wonder if there is a guide on:
- how to load the model and generate embeddings in Hugging Face (for example, for cosine-similarity comparison between different texts)
- how to fine-tune the model for sequence classification (e.g. multi-label or multi-class topic classification)
There’s an example for LaBSE v1 with Hugging Face and sentence-transformers on creating embeddings, but I’d like a beginner-friendly guide for v2.
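For the embeddings part, something like the sketch below is what I have in mind. The checkpoint name `sentence-transformers/LaBSE` is my assumption (I’m not certain which TF Hub version it corresponds to, so please verify on the Hub); the model-loading step is shown only in comments since it needs a large download, and the cosine-similarity math is demonstrated on stand-in vectors:

```python
import numpy as np

# Assumed loading step (requires `pip install sentence-transformers` and a
# network download; the checkpoint name is a guess, verify on the Hub):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer("sentence-transformers/LaBSE")
# embeddings = model.encode(["Hello world", "Hallo Welt"])

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in "embeddings" to show the comparison itself:
a = np.array([0.6, 0.8])
b = np.array([0.8, 0.6])
print(cosine_similarity(a, b))  # close to 0.96
print(cosine_similarity(a, a))  # close to 1.0 (identical texts)
```

Since LaBSE embeddings are L2-normalized, the cosine similarity of real embeddings reduces to a plain dot product.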
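For the fine-tuning part, this is the kind of sketch I’m after. The `transformers` call is shown in comments (it needs a large download, and the checkpoint name `setu4993/LaBSE` is my assumption); the runnable part just illustrates the key difference between the two settings, namely the output activation: multi-class uses a softmax over mutually exclusive classes, multi-label uses an independent sigmoid per label:

```python
import numpy as np

# Assumed fine-tuning setup with Hugging Face transformers (checkpoint name
# is a guess; verify on the Hub):
# from transformers import AutoModelForSequenceClassification
# model = AutoModelForSequenceClassification.from_pretrained(
#     "setu4993/LaBSE",
#     num_labels=5,
#     problem_type="multi_label_classification",  # omit for multi-class
# )

def softmax(logits: np.ndarray) -> np.ndarray:
    """Multi-class: probabilities over classes, summing to 1."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def sigmoid(logits: np.ndarray) -> np.ndarray:
    """Multi-label: an independent probability per label."""
    return 1.0 / (1.0 + np.exp(-logits))

logits = np.array([2.0, -1.0, 0.5])
print(softmax(logits))  # sums to 1: pick exactly one topic
print(sigmoid(logits))  # each in (0, 1): any subset of topics
```

Correspondingly, multi-class training would use cross-entropy loss while multi-label training would use binary cross-entropy per label.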