The datasets.map function does not load the cached dataset

Thanks for your reply! I am sure that none of the parameters changed. While trying to figure out the cause, I noticed something interesting: map does load the processed dataset from the cache if I change nothing and rerun the same file. However, if I copy the exact same code into another .py file and run it, the dataset is processed again, even though I would expect the second run to hit the cache.
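To narrow this down, I think one can compare the fingerprint that datasets computes for the mapped function in each file; here is a minimal sketch, assuming Hasher from datasets.fingerprint (which, as far as I understand, is what map uses internally to hash its inputs). If the printed hash differs between the two files, that would explain the cache miss.

from datasets.fingerprint import Hasher
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

# If this hash differs between the two .py files, map treats the
# function as new and recomputes instead of loading from the cache.
print(Hasher.hash(tokenize_function))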

The following is the code snippet; I changed nothing, only ran it from a different file.

if __name__ == '__main__':
    from datasets import load_dataset
    from transformers import AutoTokenizer

    # Download (or load from cache) the raw WikiText-2 dataset
    raw_datasets = load_dataset("wikitext", "wikitext-2-raw-v1")

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    def tokenize_function(examples):
        # Tokenize a batch of examples, padding/truncating to the model's max length
        return tokenizer(examples["text"], padding="max_length", truncation=True)

    # This call should reuse the cached result when nothing has changed
    tokenized_datasets = raw_datasets.map(tokenize_function, batched=True)
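As a possible workaround, passing explicit cache file names via the cache_file_names parameter of DatasetDict.map should make map write to, and reload from, fixed paths regardless of which script runs it. A sketch, assuming the standard train/validation/test splits of wikitext-2-raw-v1 (the cache/ paths below are hypothetical, and the directory must already exist):

# One cache file per split; identical paths across scripts should
# let the second script reload instead of reprocessing.
tokenized_datasets = raw_datasets.map(
    tokenize_function,
    batched=True,
    cache_file_names={
        "train": "cache/tokenized-train.arrow",
        "validation": "cache/tokenized-validation.arrow",
        "test": "cache/tokenized-test.arrow",
    },
)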