I tried deploying an unmodified version of microsoft/Phi-3.5-mini-instruct on a dedicated Inference Endpoint.
The model is detected as supported by Text Generation Inference, and when loading it through this optimized container, I’m seeing the following warnings in the logs:
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791424Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|endoftext|>' was expected to have ID '32000' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791447Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|assistant|>' was expected to have ID '32001' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791453Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|placeholder1|>' was expected to have ID '32002' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791457Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|placeholder2|>' was expected to have ID '32003' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791462Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|placeholder3|>' was expected to have ID '32004' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791466Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|placeholder4|>' was expected to have ID '32005' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791470Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|system|>' was expected to have ID '32006' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791476Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|end|>' was expected to have ID '32007' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791481Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|placeholder5|>' was expected to have ID '32008' but was given ID 'None'|
These warnings do not appear when using the “Default” container, only with TGI.
Does anyone know what effect these warnings have? The model is still able to differentiate between the <|end|> and <|assistant|> tokens in the inference request’s input.
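
For anyone who wants to check this themselves, here is a rough sketch of how the expected IDs could be pulled out of the warnings and compared against whatever a tokenizer actually returns. The helper names (`parse_expected_ids`, `find_mismatches`) are made up for illustration, and the lookup function is an assumption: any callable that maps a token string to an ID (or None) will do. The sample log keeps only the message part of two of the warnings above.

```python
import re

# Matches the message part of the tokenizers warnings quoted in the logs.
WARNING_RE = re.compile(
    r"Token '(?P<token>[^']+)' was expected to have ID '(?P<expected>\d+)' "
    r"but was given ID '(?P<actual>[^']+)'"
)


def parse_expected_ids(log_text):
    """Return {token: expected_id} for every matching warning in log_text."""
    expected = {}
    for match in WARNING_RE.finditer(log_text):
        expected[match.group("token")] = int(match.group("expected"))
    return expected


def find_mismatches(expected, token_to_id):
    """Return {token: (expected_id, actual_id)} where the lookup disagrees.

    token_to_id is any callable taking a token string and returning an
    int ID, or None if the token is unknown (hypothetical interface).
    """
    mismatches = {}
    for token, expected_id in expected.items():
        actual_id = token_to_id(token)
        if actual_id != expected_id:
            mismatches[token] = (expected_id, actual_id)
    return mismatches


# Shortened copies of two warnings from the logs above.
sample_log = (
    "Warning: Token '<|endoftext|>' was expected to have ID '32000' "
    "but was given ID 'None'\n"
    "Warning: Token '<|assistant|>' was expected to have ID '32001' "
    "but was given ID 'None'\n"
)

expected_ids = parse_expected_ids(sample_log)

# Stand-in vocabulary for demonstration only; replace with a real lookup.
fake_vocab = {"<|endoftext|>": 32000}
mismatches = find_mismatches(expected_ids, fake_vocab.get)
for token, (want, got) in mismatches.items():
    print(f"{token}: expected {want}, got {got}")
```

With the `tokenizers` library installed, the `token_to_id` method of `Tokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct")` could be passed in place of `fake_vocab.get`. I have not verified what that returns inside the TGI container, so an empty result there would only suggest, not prove, that the warnings are harmless.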