Hello everyone. I'm a beginner learning LLMs and recently picked up Jay Alammar's book. I'm trying to replicate the author's code from the first chapter in Colab, but I can't get it to work. It looks like the latest version of the transformers library has removed some functions and methods. The code is simple:
```
# Check the version of the transformers library
import transformers
print("Transformers version:", transformers.__version__)
# Output in Colab: 'Transformers version: 4.56.1'

# It's also good practice to check the torch (PyTorch) version
import torch
print("PyTorch version:", torch.__version__)
# Output in Colab: 'PyTorch version: 2.8.0+cu126'

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Create a text-generation pipeline
generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,
    max_new_tokens=500,
    do_sample=False,
)

# The prompt (user input/query)
messages = [
    {"role": "user", "content": "Create a funny joke about chickens."}
]

# Generate output
output = generator(messages)
print(output[0]['generated_text'])
```
However, the above code gives me the following error:
```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipython-input-262462900.py in <cell line: 0>()
      5
      6 # Generate Output
----> 7 output = generator(messages)
      8 print(output[0]['generated_text'])

8 frames
~/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-mini-4k-instruct/0a67737cc96d2554230f90338b163bc6380a2a85/modeling_phi3.py in prepare_inputs_for_generation(self, input_ids, past_key_values, attention_mask, inputs_embeds, **kwargs)
   1289         if isinstance(past_key_values, Cache):
   1290             cache_length = past_key_values.get_seq_length()
-> 1291             past_length = past_key_values.seen_tokens
   1292             max_cache_length = past_key_values.get_max_length()
   1293         else:

AttributeError: 'DynamicCache' object has no attribute 'seen_tokens'
```
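From the traceback, the failing code lives in the model repo's own `modeling_phi3.py` (downloaded because of `trust_remote_code=True`), not in transformers itself, so I suspect that stale remote code is what's incompatible with the newer cache API. Would dropping `trust_remote_code` so that the transformers-native Phi-3 implementation is used instead be a reasonable fix? A minimal sketch of what I mean (untested, and it assumes recent transformers versions support this checkpoint natively):

```
# Sketch: load without trust_remote_code so the transformers-native
# Phi-3 implementation is used instead of the repo's bundled
# modeling_phi3.py. (Untested assumption on my part.)
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    device_map="auto",
    torch_dtype="auto",
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,
    max_new_tokens=500,
    do_sample=False,
)
```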
I tried modifying the code with ChatGPT, DeepSeek, and Colab's built-in Gemini, but none of them could solve the problem. One of the solutions they suggested was to fall back to an older transformers version (4.36.0), which I believe won't help me in the long term.
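For reference, that fallback is just a version pin in Colab (I haven't committed to it, since pinning an old release feels like a dead end):

```
# Suggested fallback: pin transformers to the version the book was written against.
# I haven't verified this; the Colab runtime must be restarted after installing.
!pip install transformers==4.36.0
```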
What could be a possible solution for this? Is the book really outdated just 11 months after its release? Please help! I'm not able to proceed further.