Different embeddings when using sentence-transformers and transformers.js

Hello, I am building a multilabel classifier that uses the embeddings from sentence-transformers/all-MiniLM-L6-v2 as input. After training a model that produces good enough results, I would like to run this in the browser using transformers.js and the Xenova/all-MiniLM-L6-v2 model. However, I am getting different embeddings for the same text.

Here is my Python code:

from sentence_transformers import SentenceTransformer

model_name = "sentence-transformers/all-MiniLM-L6-v2"
mdl = SentenceTransformer(model_name)
raw_inputs = [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
]
se = mdl.encode(raw_inputs)
# the first 4 dimensions...
# [[-0.0635541   0.00168205  0.08878317  0.01061784]
#  [-0.0278877   0.02493023  0.01891949  0.03274209]]

The JavaScript code:

import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@latest';
let extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2')
let result = await extractor( [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
], { pooling: 'mean', normalize: true });
console.log( result.data.slice(0,4) )
console.log( result.data.slice(384,388) )

// [-0.0713, 0.0169, 0.0940, 0.00842]
// [-0.0041, 0.0070, 0.0365, 0.0422]

I would like to reproduce the sentence-transformers embeddings. If that is not possible, I just need the Python and JavaScript embeddings to match, and I will try to retrain. My specific questions are:

  1. Am I doing this correctly?
  2. If so, can I get the JavaScript embeddings to match?

Thank you

Hi,
I noticed the same thing with sentence-transformers and transformers.
Why is this the case? Am I making a mistake, or is it fine as long as the same method is used for every embedding I create?

My goal is to build a local RAG application.

from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModel
import torch
import pandas as pd

# sentence-transformer version
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

sentences = [
    "This framework generates embeddings for each input sentence",
    "Sentences are passed as a list of strings.",
    "The quick brown fox jumps over the lazy dog.",
]

embeddings = model.encode(sentences)
df = pd.DataFrame(embeddings, index=sentences)
print("sentence-transformer version")
print(df)


# ---------------------------------------------------------------
# transformer version

# Mean Pooling - Take attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    sum_embeddings = torch.sum(token_embeddings * input_mask_expanded, 1)
    sum_mask = torch.clamp(input_mask_expanded.sum(1), min=1e-9)
    return sum_embeddings / sum_mask



# Load AutoModel from huggingface model repository
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

# Tokenize sentences
encoded_input = tokenizer(
    sentences, padding=True, truncation=True, max_length=128, return_tensors="pt"
)

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling. In this case, mean pooling
sentence_embeddings = mean_pooling(model_output, encoded_input["attention_mask"])
df2 = pd.DataFrame(sentence_embeddings.numpy(), index=sentences)
print("transformer version")
print(df2)

@Stefan-LTB I was able to get the embeddings to match. I believe the difference was caused by transformers.js preferring quantized models by default. After changing my JS code to pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', { quantized: false }), the values looked the same.
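
For reference, the updated setup (same CDN import as in my earlier snippet) looks roughly like this:

import { pipeline } from 'https://cdn.jsdelivr.net/npm/@xenova/transformers@latest';

// Load the unquantized ONNX weights so the output matches the Python model
let extractor = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2', { quantized: false });
let result = await extractor([
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
], { pooling: 'mean', normalize: true });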

I am not sure why your transformers and sentence_transformers versions are not producing the same results. I'm pretty sure I looked at that case as well. You may want to double-check your masking.
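
If it helps, here is a rough sketch of how I would compare the two versions numerically. It reuses the variables from your code above, and it assumes the difference comes from normalization: as far as I can tell, the SentenceTransformer pipeline for all-MiniLM-L6-v2 also L2-normalizes after mean pooling, so the sketch normalizes the manual version before comparing.

import torch
import torch.nn.functional as F

# sentence-transformers output from the first block
st_embeddings = torch.tensor(embeddings)

# manual transformers output from the second block
manual = mean_pooling(model_output, encoded_input["attention_mask"])

# Assumption: the hosted model applies L2 normalization after mean pooling,
# so normalize here before comparing
manual = F.normalize(manual, p=2, dim=1)

print("max abs diff:", (st_embeddings - manual).abs().max().item())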