Optimizing Model Loading with a CPU Bottleneck

I was hoping to check with the maintainers about actually adding the safetensors weights (there is still an open PR for that), and then the sharded version too.

It seems a PR has already been opened for that. All that's needed is for the maintainer to merge it, but I guess it's been forgotten… :sweat_smile:
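For context on what "the sharded version" involves: in the Hugging Face ecosystem, `model.save_pretrained(out_dir, safe_serialization=True, max_shard_size="5GB")` writes the weights as sharded safetensors files plus a `model.safetensors.index.json` that maps each tensor name to its shard. Below is a minimal stdlib-only sketch of that index layout, with an illustrative greedy sharding rule; tensor names and the byte threshold are hypothetical, not taken from the PR in question.

```python
def shard_state_dict(sizes, max_shard_bytes):
    """Greedily group tensors into shards and build an index dict in the
    style of model.safetensors.index.json (a "weight_map" from tensor
    name to shard file, plus total size metadata).

    sizes: dict mapping tensor name -> size in bytes (illustrative input;
    a real checkpoint would derive this from the actual tensors).
    """
    shards, current, current_bytes = [], [], 0
    for name, nbytes in sizes.items():
        # Start a new shard once adding this tensor would exceed the cap.
        if current and current_bytes + nbytes > max_shard_bytes:
            shards.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += nbytes
    if current:
        shards.append(current)

    n = len(shards)
    weight_map = {
        name: f"model-{i + 1:05d}-of-{n:05d}.safetensors"
        for i, shard in enumerate(shards)
        for name in shard
    }
    return {
        "metadata": {"total_size": sum(sizes.values())},
        "weight_map": weight_map,
    }
```

With three 4-byte tensors and an 8-byte cap, this yields two shards, with the first two tensors mapped to `model-00001-of-00002.safetensors` and the third to `model-00002-of-00002.safetensors`.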