Lazy model initialization

arteagac · July 27, 2022, 6:15pm

Hello. How can I create a model object while skipping the random initialization of its weights? The random initialization is time-consuming and unnecessary in my case, since I want to load the weights with load_state_dict. For instance, see the code below.

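import torch
from transformers import BloomConfig
from transformers.models.bloom.modeling_bloom import BloomBlock
config = BloomConfig.from_pretrained("bigscience/bloom")
block = BloomBlock(config)  # initializes weights randomly, which is time-consuming
block.load_state_dict(torch.load("path_to_pytorch_bin"))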

Fazzie · May 7, 2024, 7:56am

Same question.

ayushgoel · May 8, 2024, 10:12pm

Use the init_empty_weights context manager from accelerate:

from accelerate import init_empty_weights
with init_empty_weights():
    block = BloomBlock(config)

nielsr · May 8, 2024, 10:27pm

You can set low_cpu_mem_usage=True when calling from_pretrained, which will skip that; see the Models documentation.
