Hi, I converted the BAAI/bge-base-en-v1.5 model to ONNX format and did some performance testing.
At small sequence lengths, the ONNX model seems faster than both the transformers and sentence_transformers implementations.
However, as the sequence length increases, it goes from being almost twice as fast to actually being slower. Does anyone know why this is?
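For context, here is a minimal sketch of the kind of comparison I'm describing. The exact export path and benchmark details below are assumptions, not my actual setup: it uses `optimum.onnxruntime` to export the checkpoint to ONNX and times both backends at a few padded sequence lengths.

```python
import time
import statistics

def bench(encode_fn, texts, n_runs=20, warmup=3):
    """Time encode_fn over texts; return the median seconds per call."""
    for _ in range(warmup):
        encode_fn(texts)
    times = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        encode_fn(texts)
        times.append(time.perf_counter() - t0)
    return statistics.median(times)

def run_comparison(model_id="BAAI/bge-base-en-v1.5"):
    """Compare PyTorch vs ONNX Runtime latency at several sequence lengths.

    Requires transformers, torch, and optimum[onnxruntime]; calling this
    downloads the checkpoint and exports it to ONNX on the fly.
    """
    import torch
    from transformers import AutoTokenizer, AutoModel
    from optimum.onnxruntime import ORTModelForFeatureExtraction

    tok = AutoTokenizer.from_pretrained(model_id)
    pt_model = AutoModel.from_pretrained(model_id).eval()
    # export=True converts the PyTorch checkpoint to ONNX at load time
    ort_model = ORTModelForFeatureExtraction.from_pretrained(model_id, export=True)

    def encode(model, texts, max_len):
        # Pad to a fixed length so each run sees the same sequence length
        inputs = tok(texts, padding="max_length", truncation=True,
                     max_length=max_len, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs)
        return out.last_hidden_state[:, 0]  # CLS-token embedding

    texts = ["example sentence for benchmarking"] * 8
    for seq_len in (16, 64, 256, 512):
        pt_t = bench(lambda t: encode(pt_model, t, seq_len), texts, n_runs=10)
        ort_t = bench(lambda t: encode(ort_model, t, seq_len), texts, n_runs=10)
        print(f"seq_len={seq_len:4d}  torch={pt_t:.4f}s  onnx={ort_t:.4f}s")
```

Calling `run_comparison()` prints one line per sequence length; in my tests the ONNX column wins at the small lengths and loses at the large ones.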