It seems that whenever it tries to download the model weights from HF, it keeps looking for the ONNX version of the weights, which does not exist. The model repo currently has weights only in safetensors format, which I suppose is why I am getting the error above. Has anyone encountered the same problem? Thank you!
I am also getting this error when I try to serve a TEI endpoint using Docker. I had to switch to a model that has ONNX weights, and unfortunately could not use gte-Qwen for my embeddings.
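For anyone who wants to check which weight formats a repo ships before pointing a server at it, here is a minimal sketch. It assumes the `huggingface_hub` Python package is installed for the live lookup. The helper names `weight_formats` and `repo_weight_formats` and the sample file lists are made up for illustration; they are not taken from any real repo.

```python
from typing import Iterable


def weight_formats(filenames: Iterable[str]) -> dict:
    """Report which weight formats appear in a list of repo file paths."""
    names = [name.lower() for name in filenames]
    return {
        "safetensors": any(n.endswith(".safetensors") for n in names),
        "onnx": any(n.endswith(".onnx") for n in names),
    }


def repo_weight_formats(repo_id: str) -> dict:
    """Fetch the file list of a Hub repo and classify it (needs network access)."""
    # Imported lazily so the pure helper above works without huggingface_hub.
    from huggingface_hub import list_repo_files

    return weight_formats(list_repo_files(repo_id))


# Hypothetical file lists, for illustration only.
safetensors_only = ["config.json", "model-00001-of-00002.safetensors", "tokenizer.json"]
with_onnx = ["config.json", "model.safetensors", "onnx/model.onnx"]

print(weight_formats(safetensors_only))
print(weight_formats(with_onnx))
```

Calling `repo_weight_formats("<your-repo-id>")` with the repo you are trying to serve should show whether an `.onnx` file is there at all. If `onnx` comes back `False`, that would be consistent with the download failing as described above.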