Very long warning when running rag-end2end-retriever

I’m trying to run the RAG end2end retriever example as described here:

I get a very long sequence of warnings about tokenizer class mismatches. Is this normal? I’ve pasted part of the output below; many more warnings follow.

My environment:
pytorch:1.12.1-cuda11.3-cudnn8
pytorch-lightning==1.6.4
transformers==4.21.2

@shamanez

Some of the output:

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is ‘DPRQuestionEncoderTokenizer’.
The class this function is called from is ‘DPRContextEncoderTokenizerFast’.

(followed by the tokenizer config)

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is ‘RagTokenizer’.
The class this function is called from is ‘DPRQuestionEncoderTokenizer’.
(followed by the tokenizer config)

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is ‘RagTokenizer’.
The class this function is called from is ‘DPRQuestionEncoderTokenizerFast’.
Could not locate the tokenizer configuration file, will try to use the model config instead.

Yeah, these warnings are normal. You’re all good :)
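
If the log noise gets in the way, one option is to raise the log threshold for the `transformers` logger namespace. The snippet below is only a sketch and not part of the rag-end2end-retriever example: the helper name `quiet_transformers_logs` is made up for illustration, and it assumes the messages go through Python's standard `logging` module under loggers whose names start with `transformers`. It also hides other warnings from the library, so use it only once you are sure they are harmless.

```python
import logging


def quiet_transformers_logs(level=logging.ERROR, logger_name="transformers"):
    """Raise the threshold of a logger namespace and return its previous level.

    Hypothetical helper for illustration. Child loggers such as
    "transformers.tokenization_utils_base" inherit the level set here
    unless they have been given a level of their own.
    """
    logger = logging.getLogger(logger_name)
    previous = logger.level
    logger.setLevel(level)
    return previous


# Call this after `import transformers` and after any verbosity setting made
# by the training script; otherwise the library or the script may reset it.
previous_level = quiet_transformers_logs()
```

The library also has its own helper, `transformers.logging.set_verbosity_error()`, and reads the `TRANSFORMERS_VERBOSITY` environment variable (for example `TRANSFORMERS_VERBOSITY=error`), which should have a similar effect. To restore the earlier verbosity, pass the returned value back in: `quiet_transformers_logs(previous_level)`.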