Phi 3.5 Tokenizer warnings

Tried deploying an unmodified version of microsoft/Phi-3.5-mini-instruct on a dedicated Inference Endpoint.
The model is detected as supported by Text Generation Inference, and when loading it through this optimized container, I’m seeing the following warnings in the logs:

|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791424Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|endoftext|>' was expected to have ID '32000' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791447Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|assistant|>' was expected to have ID '32001' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791453Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|placeholder1|>' was expected to have ID '32002' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791457Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|placeholder2|>' was expected to have ID '32003' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791462Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|placeholder3|>' was expected to have ID '32004' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791466Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|placeholder4|>' was expected to have ID '32005' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791470Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|system|>' was expected to have ID '32006' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791476Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|end|>' was expected to have ID '32007' but was given ID 'None'|
|09/24/2024, 08:29:28|WARN|2024-09-24T08:29:28.791481Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|placeholder5|>' was expected to have ID '32008' but was given ID 'None'|

The warning does not appear when using the “Default” container, only in TGI.

Does anyone know what the effect of these warnings is? The model is still able to differentiate between <|end|> and <|assistant|> tokens in the inference request’s input.
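One rough way to check whether the warnings matter would be to pull the token/ID pairs out of the log and compare them with the IDs the loaded tokenizer actually returns. Below is a minimal sketch using only the Python standard library. The helper names `parse_expected_ids` and `find_mismatches` are hypothetical, not part of any library, and the regex assumes the warning text has exactly the shape shown in the log lines above.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Assumed shape of the warning text, copied from the log lines above, e.g.
# Warning: Token '<|end|>' was expected to have ID '32007' but was given ID 'None'
WARNING_RE = re.compile(
    r"Token '(?P<token>.+?)' was expected to have ID '(?P<expected>\d+)' "
    r"but was given ID '(?P<given>[^']*)'"
)


def parse_expected_ids(log_text: str) -> Dict[str, int]:
    """Return {token: expected_id} for every matching warning in log_text."""
    return {
        m.group("token"): int(m.group("expected"))
        for m in WARNING_RE.finditer(log_text)
    }


def find_mismatches(
    expected: Dict[str, int],
    token_to_id: Callable[[str], Optional[int]],
) -> Dict[str, Tuple[int, Optional[int]]]:
    """Return {token: (expected_id, actual_id)} where the lookup disagrees.

    token_to_id is any callable that maps a token string to its ID
    (or None when the token is unknown).
    """
    mismatches = {}
    for token, want in expected.items():
        got = token_to_id(token)
        if got != want:
            mismatches[token] = (want, got)
    return mismatches
```

If `transformers` is installed, something like `AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct").convert_tokens_to_ids` could be passed as `token_to_id`. That only checks the tokenizer as loaded locally, which may not be the same as what the TGI container loads. An empty result would suggest the special tokens end up with the expected IDs despite the warnings.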


I don’t know the detailed logic, but it seems that the contents of this file are not being reflected properly.

I wonder if the wrong model class is being chosen automatically when the model is loaded with transformers on the server side?
README.md, with its YAML header, effectively works as machine-readable configuration, but most people treat it as just an instruction manual, so it is often mishandled and ends up containing incorrect information.

Also, I think there are cases where support in Inference lags behind even when transformers and diffusers already support a model.
