Very long warning when running rag-end2end-retriever

yicxu · August 26, 2022, 8:20pm

I’m trying to run rag end2end retriever as here:

I get a very long sequence of warnings, many of them about tokenizer class mismatches. Is this normal? I've pasted part of it below; many more follow.

My environment:
pytorch:1.12.1-cuda11.3-cudnn8
pytorch-lightning==1.6.4
transformers==4.21.2

@shamanez

Some of the output:

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is ‘DPRQuestionEncoderTokenizer’.
The class this function is called from is ‘DPRContextEncoderTokenizerFast’.

**followed by tokenizer config

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is ‘RagTokenizer’.
The class this function is called from is ‘DPRQuestionEncoderTokenizer’.
**followed by tokenizer config

The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is ‘RagTokenizer’.
The class this function is called from is ‘DPRQuestionEncoderTokenizerFast’.
Could not locate the tokenizer configuration file, will try to use the model config instead.

shamanez · August 26, 2022, 10:43pm

Yeah, these warnings are normal. You’re all good.
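If the sheer volume of output is the problem, one option is to raise the log level for the library. The snippet below is a minimal sketch, not something taken from the rag-end2end-retriever scripts: it assumes the messages are emitted through Python's standard `logging` module under the `transformers` logger namespace, and the helper name `quiet_transformers_warnings` is made up for illustration. It hides every `transformers` warning, not only the tokenizer class mismatch ones, so use it only once you are satisfied the warnings are harmless.

```python
import logging


def quiet_transformers_warnings(level: int = logging.ERROR) -> logging.Logger:
    """Raise the threshold of the `transformers` logger namespace.

    Hypothetical helper for illustration. Child loggers such as
    `transformers.tokenization_utils_base` have no level of their own,
    so they inherit the level set here.
    """
    logger = logging.getLogger("transformers")
    logger.setLevel(level)
    return logger


# Assumption: call this after `import transformers`, because the library
# sets its own default level when it first configures its logging.
# The library's own helper, `transformers.logging.set_verbosity_error()`,
# should have the same effect.
quiet_transformers_warnings()
```

Setting the environment variable `TRANSFORMERS_VERBOSITY=error` before launching the training script is another way to get the same result without touching the code.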
