Does pipeline with accelerate use "with init_empty_weights():"?

I am using pipeline to run inference on LLMs from 7B to 70B parameters, both on Kaggle and on my local setup.

I have seen how accelerate is applied for inference with large models, and I looked through base.py and the utils of pipeline from Hugging Face.

I didn't find "with init_empty_weights():", which is used for inference with big models.

Does pipeline not use it, or is it used but not written in those files?

If it is not used there, can I create the model under "with init_empty_weights():", load it with load_checkpoint_and_dispatch() and tie_weights(), and then pass that model to pipeline with the same device placement I used when loading?

Hi @vivek9840, you can’t do the following:

If it is not used there, can I create the model under "with init_empty_weights():", load it with load_checkpoint_and_dispatch() and tie_weights(), and then pass that model to pipeline with the same device placement I used when loading?

However, you can pass device_map="auto" to pipeline. It will use init_empty_weights and more to load your model on your GPU(s). With this argument, you will be able to perform big model inference as described in "Handling big models for inference".
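For reference, here is a minimal sketch of that suggestion. The helper names pipeline_arguments and build_pipeline are made up for illustration, and the model id in the commented call is a placeholder, not a real checkpoint. It assumes transformers and accelerate are both installed.

```python
def pipeline_arguments(model_id, task="text-generation"):
    """Collect the keyword arguments passed to transformers.pipeline."""
    return {
        "task": task,
        "model": model_id,
        # Let Accelerate decide where each layer goes (GPUs first, then CPU, then disk).
        "device_map": "auto",
    }


def build_pipeline(model_id, task="text-generation"):
    """Build a pipeline whose model is loaded and dispatched automatically."""
    # Imported here so the helper above works without transformers installed.
    from transformers import pipeline

    return pipeline(**pipeline_arguments(model_id, task))


# Example call (downloads the checkpoint, so it is left commented out):
# generator = build_pipeline("your-org/your-7b-model")  # placeholder model id
# print(generator("Hello", max_new_tokens=20)[0]["generated_text"])
```

Note that the sketch does not pass a device argument alongside device_map; the placement is left entirely to device_map="auto".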

Thanks for your reply. I was also curious because I didn't see the line where it uses init_empty_weights. I looked through the files in pipeline and accelerate, but I didn't see init_empty_weights used there.
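As a side note on what init_empty_weights itself does, the small sketch below creates a layer under the context manager and reports which device its weights end up on. The function name empty_linear_device is hypothetical, and the sketch assumes torch and accelerate are installed. Since pipeline hands model loading over to the model's from_pretrained method, the call is probably reached through that loading code rather than written in the pipeline files, but that is an assumption and the exact location may differ between versions.

```python
def empty_linear_device(features=4096):
    """Create a layer under init_empty_weights and return its weight's device type."""
    # Imported lazily so the function can be defined without torch/accelerate present.
    import torch.nn as nn
    from accelerate import init_empty_weights

    with init_empty_weights():
        # Parameters are created on the "meta" device, so no real memory is allocated.
        layer = nn.Linear(features, features)
    return layer.weight.device.type


# Example call (requires torch and accelerate):
# print(empty_linear_device())
```

Because the weights are placeholders without data, a model created this way still needs a checkpoint loaded into it before it can be used for inference.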
