Optimum: failed download of jina-embeddings-v2-base-es

Hi,
I’m trying to download jina-embeddings-v2-base-es as ONNX with the following command:

!optimum-cli export onnx --opset 16 --trust-remote-code -m jinaai/jina-embeddings-v2-base-es jina_embeddings_v2_base_es

And get an error:

RuntimeError: The serialized model is larger than the 2GiB limit imposed by the protobuf library. Therefore the output file must be a file path, so that the ONNX external data can be written to the same directory. Please specify the output file name.

I tried with optimization O3 and got the same result.
I’m a beginner with Optimum, but I managed to download bge-small-en-v1.5 and all-MiniLM-L6-v2 without problems.
I know those models are pretty small, but do you think there is a way to download jina-embeddings-v2-base-es in a compact size?

Hi @Benitoski! If you just want to use an existing ONNX checkpoint in jina-embeddings-v2-base-es, you can simply load it with:

from optimum.onnxruntime import ORTModelForMaskedLM

model = ORTModelForMaskedLM.from_pretrained("jinaai/jina-embeddings-v2-base-es", file_name="onnx/model.onnx")

The command you ran actually exports the model from the PyTorch checkpoint.
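If the goal is a local copy of that existing ONNX file rather than a fresh export, something like the sketch below might be enough. It assumes that `save_pretrained` on the loaded model writes the ONNX file and config into the target folder; the function name `save_onnx_checkpoint_locally` and its defaults are only illustrative, and I have not run this against this particular repository.

```python
def save_onnx_checkpoint_locally(
    repo_id="jinaai/jina-embeddings-v2-base-es",
    file_name="onnx/model.onnx",
    save_dir="jina_embeddings_v2_base_es",
):
    """Load the ONNX checkpoint published in the repo and copy it to save_dir.

    Hypothetical helper: the name and default arguments are illustrative only.
    """
    # Imported lazily so this file can be imported without optimum installed.
    from optimum.onnxruntime import ORTModelForMaskedLM

    # Downloads the existing ONNX file from the Hub (no PyTorch export involved).
    model = ORTModelForMaskedLM.from_pretrained(repo_id, file_name=file_name)

    # Assumption: this writes the ONNX file and config into save_dir.
    model.save_pretrained(save_dir)
    return model


if __name__ == "__main__":
    save_onnx_checkpoint_locally()
```

The tokenizer is not handled here; it would have to be loaded and saved separately if you need it in the same folder.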

Thanks Régis!

I tried the following:

from optimum.onnxruntime import ORTModelForMaskedLM

model = ORTModelForMaskedLM.from_pretrained("jinaai/jina-embeddings-v2-base-es", file_name="onnx/model.onnx", use_quantized=True)

!optimum-cli export onnx --opset 16 --trust-remote-code -m model jina_embeddings_v2_base_es jina_embeddings_v2_base_es

But I got a warning and an error message:

The ONNX file onnx/model.onnx is not a regular name used in optimum.onnxruntime, the ORTModel might not behave as expected.

usage: optimum-cli

Optimum CLI tool: error: unrecognized arguments: jina_embeddings_v2_base_es

For some reason the export does not recognize that I want to store the downloaded model in the folder jina_embeddings_v2_base_es.

Any suggestions?
Thanks in advance

Oscar

The warning message you get with ORTModelForMaskedLM.from_pretrained is quite generic and doesn’t mean your model will misbehave. Have you tested it?

Hi Régis,

Thanks for your response. I have not been able to test the model because it does not download, due to the error I mentioned in my previous message.
The error says:

Optimum CLI tool: error: unrecognized arguments: jina_embeddings_v2_base_es

It seems that the output path is not being recognized.

Can you think of anything I can change in the following to be able to download the model?

from optimum.onnxruntime import ORTModelForMaskedLM

model = ORTModelForMaskedLM.from_pretrained("jinaai/jina-embeddings-v2-base-es", file_name="onnx/model.onnx", use_quantized=True)

!optimum-cli export onnx --opset 16 --trust-remote-code -m model jina_embeddings_v2_base_es jina_embeddings_v2_base_es

Thanks in advance

Oscar
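A possible reading of the "unrecognized arguments" error: the posted command passes two trailing positional values (jina_embeddings_v2_base_es twice), while the export command appears to take a single output path, so the second one is left over. Also, `-m model` in a shell command passes the literal string "model", not the Python variable of that name. The sketch below reproduces this with a small stand-in parser; it is a hypothetical approximation built with argparse, not the actual optimum-cli parser.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the export subcommand: a few options
    # plus exactly one positional output directory.
    parser = argparse.ArgumentParser(prog="export-sketch")
    parser.add_argument("-m", "--model", required=True)
    parser.add_argument("--opset", type=int)
    parser.add_argument("--trust-remote-code", action="store_true")
    parser.add_argument("output")
    return parser


# Arguments as in the posted command: two trailing positional values.
posted = [
    "--opset", "16", "--trust-remote-code",
    "-m", "model",
    "jina_embeddings_v2_base_es", "jina_embeddings_v2_base_es",
]
# Same command with the model id after -m and a single output directory.
fixed = [
    "--opset", "16", "--trust-remote-code",
    "-m", "jinaai/jina-embeddings-v2-base-es",
    "jina_embeddings_v2_base_es",
]

# parse_known_args returns leftovers instead of exiting with an error.
args_posted, extras_posted = build_parser().parse_known_args(posted)
args_fixed, extras_fixed = build_parser().parse_known_args(fixed)

print("posted: model =", args_posted.model, "| leftover =", extras_posted)
print("fixed:  model =", args_fixed.model, "| leftover =", extras_fixed)
```

With a single output path the parser has nothing left over. Note that this form is the same export command as in the first post, so it may still hit the original 2GiB error; it only explains why the later attempt failed earlier, at argument parsing.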