I have cloned the embedding model udever-bloom-1b1. At first the server asked for ONNX files, which the repo didn't have, so I converted the model to ONNX and then pushed it to my repo here.
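For reference, the conversion and push were roughly along these lines (a sketch, not the exact commands: the source repo id and output directory are placeholders I'm assuming here, and optimum-cli / huggingface-cli are just one common way to do this):

```shell
# Export the model to ONNX with optimum-cli (sketch; the source repo id
# "izhx/udever-bloom-1b1" and the output directory are assumed for illustration).
optimum-cli export onnx --task feature-extraction \
  --model izhx/udever-bloom-1b1 udever-bloom-1b1-onnx/

# Push the exported files to the Hub repo (repo id taken from the error log below).
huggingface-cli upload Saugatkafley/udever-bloom-1b1-onnx udever-bloom-1b1-onnx/ .
```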
I also configured config.json with {"max_position_embeddings": 2048}. After that, I copied 1_Pooling/config.json from the jinaai embeddings repo:
{
"word_embedding_dimension": 1536,
"pooling_mode_cls_token": false,
"pooling_mode_mean_tokens": true,
"pooling_mode_max_tokens": false,
"pooling_mode_mean_sqrt_len_tokens": false
}
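This config selects mean pooling (pooling_mode_mean_tokens) over the 1536-dimensional token embeddings. As a minimal illustration of what that computes (pure Python, illustrative only, not the server's actual implementation):

```python
# Illustrative mean pooling over token embeddings, as selected by
# "pooling_mode_mean_tokens": true above. Padding positions (mask 0)
# are excluded from the average.
def mean_pool(token_embeddings, attention_mask):
    """token_embeddings: list of per-token vectors; attention_mask: 1/0 per token."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, v in enumerate(vec):
                summed[i] += v
    return [s / count for s in summed]

# Example: two real tokens and one padding token.
emb = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(emb, mask))  # → [2.0, 3.0]
```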
Now, when running the embedding server, I get:
2024-08-05T09:52:57.224194Z INFO hf_hub: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/hf-hub-0.3.2/src/lib.rs:55: Token file not found "/root/.cache/huggingface/token"
2024-08-05T09:52:57.307450Z INFO download_pool_config: text_embeddings_core::download: core/src/download.rs:38: Downloading `1_Pooling/config.json`
2024-08-05T09:53:00.166636Z INFO download_new_st_config: text_embeddings_core::download: core/src/download.rs:62: Downloading `config_sentence_transformers.json`
2024-08-05T09:53:00.405920Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:21: Starting download
2024-08-05T09:53:00.405936Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:23: Downloading `config.json`
2024-08-05T09:53:00.885433Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:26: Downloading `tokenizer.json`
2024-08-05T09:53:05.337696Z INFO download_artifacts: text_embeddings_backend: backends/src/lib.rs:313: Downloading `model.onnx`
2024-08-05T09:53:05.990842Z INFO download_artifacts: text_embeddings_core::download: core/src/download.rs:32: Model artifacts downloaded in 5.584921999s
2024-08-05T09:53:06.456293Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '[BOQ]' was expected to have ID '250680' but was given ID 'None'
2024-08-05T09:53:06.456307Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '[EOQ]' was expected to have ID '250681' but was given ID 'None'
2024-08-05T09:53:06.456310Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '[BOD]' was expected to have ID '250682' but was given ID 'None'
2024-08-05T09:53:06.456313Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '[EOD]' was expected to have ID '250683' but was given ID 'None'
2024-08-05T09:53:06.457094Z WARN text_embeddings_router: router/src/lib.rs:195: Could not find a Sentence Transformers config
2024-08-05T09:53:06.457162Z INFO text_embeddings_router: router/src/lib.rs:199: Maximum number of tokens per request: 2048
2024-08-05T09:53:06.457515Z INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 4 tokenization workers
2024-08-05T09:53:06.936365Z INFO text_embeddings_router: router/src/lib.rs:241: Starting model backend
Error: Could not create backend
Caused by:
Could not start backend: Failed to create ONNX Runtime session: Deserialize tensor h.8.input_layernorm.weight failed.GetFileLength for /data/models--Saugatkafley--udever-bloom-1b1-onnx/snapshots/698acf469fd193b51dd1125dbd460c8258c7b606/model.onnx_data failed:Invalid fd was supplied: -1
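The failure is about model.onnx_data, the external-data file that ONNX writes alongside model.onnx when the weights exceed the 2 GB protobuf limit. As a sanity check, one way to confirm whether that file was actually pushed (a sketch using huggingface_hub's HfApi; the repo id is taken from the error path above, and the file may live under a subdirectory in the repo):

```python
# Sketch: list the files in the pushed repo and look for model.onnx_data
# (requires the huggingface_hub package; the exact path in the repo may differ).
from huggingface_hub import HfApi

files = HfApi().list_repo_files("Saugatkafley/udever-bloom-1b1-onnx")
print([f for f in files if f.endswith(".onnx") or f.endswith(".onnx_data")])
```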