Hello, I am building a multilabel classifier that uses the embeddings from sentence-transformers/all-MiniLM-L6-v2 as input. After training a model that produces good enough results, I would like to run it in the browser using Transformers.js and the Xenova/all-MiniLM-L6-v2 model. However, I am getting different embeddings for the same text.
Here is my Python code:
from sentence_transformers import SentenceTransformer

model_name = "sentence-transformers/all-MiniLM-L6-v2"
mdl = SentenceTransformer(model_name)
raw_inputs = [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
]
se = mdl.encode(raw_inputs)  # numpy array of shape (2, 384)
# the first 4 dimensions...
# [[-0.0635541 0.00168205 0.08878317 0.01061784]
# [-0.0278877 0.02493023 0.01891949 0.03274209]]
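For reference, my understanding is that for this model SentenceTransformer.encode() amounts to mean pooling over the transformer's last hidden state followed by L2 normalization, which is what I expect the feature-extraction pipeline below to reproduce. Here is a rough sketch adapted from the model card's usage example (I have not verified that it matches encode() exactly):
from transformers import AutoTokenizer, AutoModel
import torch
import torch.nn.functional as F

tok = AutoTokenizer.from_pretrained(model_name)
hf_model = AutoModel.from_pretrained(model_name)

enc = tok(raw_inputs, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    out = hf_model(**enc)

# mean pooling over non-padding tokens, as in the model card example
mask = enc["attention_mask"].unsqueeze(-1).float()
pooled = (out.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# the model card example then L2-normalizes the sentence embeddings
pooled = F.normalize(pooled, p=2, dim=1)
print(pooled[:, :4])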
The JavaScript code:
import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@latest';

let extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
let result = await extractor([
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
], { pooling: 'mean', normalize: true });

// result.data is the flattened [2, 384] tensor, so the second sentence starts at index 384
console.log(result.data.slice(0, 4));     // first 4 dimensions of sentence 1
console.log(result.data.slice(384, 388)); // first 4 dimensions of sentence 2
// [-0.0713, 0.0169, 0.0940, 0.00842]
// [-0.0041, 0.0070, 0.0365, 0.0422]
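To rule out an indexing mistake on the flattened result.data array, I also read the values per sentence through the tensor shape. This is just a sanity-check sketch; I am assuming the returned Tensor exposes dims and tolist(), which is how I understand the Transformers.js output:
// result is a Tensor; dims should be [2, 384] for the two sentences
console.log(result.dims);
// tolist() should give one nested array per sentence
let [first, second] = result.tolist();
console.log(first.slice(0, 4));   // should match result.data.slice(0, 4) above
console.log(second.slice(0, 4));  // should match result.data.slice(384, 388) above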
I would like to reproduce the sentence-transformers embeddings. If that is not possible, I just need the Python and JavaScript embeddings to match each other, and I will retrain. My specific questions are:
- Am I doing this correctly?
- If so, can I get the JavaScript embeddings to match the Python ones?
Thank you