Most of us use the free Colab and Kaggle GPUs, so we need to save as much VRAM as possible to fine-tune a pretrained model. But if someone is using an A100, then yes, I believe the cache should be turned on. Also, upcast the model's norm layers to float32 for better training stability.
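A minimal sketch of the norm-layer upcast in plain PyTorch (the `upcast_norm_layers` helper and the tiny demo model are my own illustration, not from any specific library; with Hugging Face transformers you would additionally toggle `model.config.use_cache` depending on available VRAM):

```python
import torch
import torch.nn as nn

def upcast_norm_layers(model: nn.Module) -> nn.Module:
    # Walk every submodule and cast normalization layers to float32.
    # Matching on the class name catches LayerNorm, RMSNorm, etc.
    for module in model.modules():
        if "norm" in module.__class__.__name__.lower():
            module.to(torch.float32)
    return model

# Tiny demo model fully cast to half precision, as in mixed-precision fine-tuning.
model = nn.Sequential(nn.Linear(8, 8), nn.LayerNorm(8)).half()
upcast_norm_layers(model)

print(model[1].weight.dtype)  # the norm layer is back in float32
print(model[0].weight.dtype)  # the linear layer stays in float16
```

Keeping only the norm layers in float32 costs almost no extra memory (they are tiny relative to the linear weights) while avoiding the reduction-precision issues that make half-precision normalization unstable.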