Inference result not aligned with local version of same model and revision

Since this involves a paid service, contacting Expert Support is probably the fastest and most reliable option, especially if it looks like a bug.

BTW, on my local PC, both revisions produce identical embeddings, and CPU differs from CUDA only in the last decimal places:

from sentence_transformers import SentenceTransformer  # sentence-transformers==4.0.1
import torch
sentences = ["This is an example sentence", "Each sentence is converted"]
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Running on {device}.") # Running on cuda.

model = SentenceTransformer("sentence-transformers/LaBSE").to(device)
embeddings = model.encode(sentences)
print("main:", embeddings)
#main: [[ 0.02882478 -0.00602382 -0.05947006 ... -0.03002249 -0.029607
#   0.00067482]
# [-0.05550233  0.02546483 -0.02157256 ...  0.02932105  0.01150041
#  -0.00848792]]

model = SentenceTransformer("sentence-transformers/LaBSE", revision="836121a0533e5664b21c7aacc5d22951f2b8b25b").to(device)
embeddings = model.encode(sentences)
print("836121a0533e5664b21c7aacc5d22951f2b8b25b:", embeddings)
#836121a0533e5664b21c7aacc5d22951f2b8b25b: [[ 0.02882478 -0.00602382 -0.05947006 ... -0.03002249 -0.029607
#   0.00067482]
# [-0.05550233  0.02546483 -0.02157256 ...  0.02932105  0.01150041
#  -0.00848792]]

model.to("cpu")
embeddings = model.encode(sentences)
print("On CPU:", embeddings)
#On CPU: [[ 0.02882476 -0.00602385 -0.05947007 ... -0.03002251 -0.02960699
#   0.00067482]
# [-0.05550234  0.02546484 -0.02157255 ...  0.02932107  0.01150037
#  -0.00848786]]
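Note that the CPU and CUDA outputs above differ only at roughly the 1e-7 level, which is expected float32 nondeterminism across backends rather than a model mismatch. A minimal sketch (NumPy only; the arrays are illustrative stand-ins truncated from the embeddings printed above) of how to check that two runs agree within tolerance:

```python
import numpy as np

# Illustrative stand-ins for two embedding runs (e.g. CUDA vs CPU);
# the values differ only at the ~1e-7 level, like the outputs above.
emb_gpu = np.array([[ 0.02882478, -0.00602382, -0.05947006],
                    [-0.05550233,  0.02546483, -0.02157256]], dtype=np.float32)
emb_cpu = np.array([[ 0.02882476, -0.00602385, -0.05947007],
                    [-0.05550234,  0.02546484, -0.02157255]], dtype=np.float32)

# Exact equality fails, but a float32-appropriate tolerance passes.
print(np.array_equal(emb_gpu, emb_cpu))          # False
print(np.allclose(emb_gpu, emb_cpu, atol=1e-5))  # True

# Per-sentence cosine similarity is effectively 1.0.
cos = (emb_gpu * emb_cpu).sum(axis=1) / (
    np.linalg.norm(emb_gpu, axis=1) * np.linalg.norm(emb_cpu, axis=1))
print(cos)
```

If a remote endpoint's embeddings fail this kind of tolerance check against a local run of the same revision, that points to a genuinely different model or preprocessing, not floating-point noise.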