Would it be possible to run a public-facing Hugging Face Space that runs large models on CPU? It would let people take a quick look at whether a model will run before they download it.
The VRAM requirements can get pretty high, but when paired with additional system memory the model still runs, which would be good enough as a test.
Running entirely from RAM is limited to RAM speed, but consider splitting the model: half loaded into RAM and half into VRAM. If the VRAM half were effectively instantaneous, only half the weights would be read at RAM speed, so the split setup would run about twice as fast as the model running in RAM alone.
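The speedup argument above can be sketched as a back-of-the-envelope calculation. The bandwidth figures and the one-full-weight-read-per-token model below are illustrative assumptions, not measurements:

```python
# Rough estimate of token-generation time when model weights are split
# between VRAM and system RAM. All numbers are illustrative assumptions.

def time_per_token(model_gb, ram_fraction, ram_bw_gbs, vram_bw_gbs):
    """Seconds per token, assuming one full read of the weights per token."""
    ram_gb = model_gb * ram_fraction
    vram_gb = model_gb * (1 - ram_fraction)
    return ram_gb / ram_bw_gbs + vram_gb / vram_bw_gbs

MODEL_GB = 100       # the 100GB model mentioned above
RAM_BW = 50          # GB/s, assumed dual-channel DDR5-class bandwidth
VRAM_BW = 1e9        # effectively "instantaneous" for the thought experiment

ram_only = time_per_token(MODEL_GB, 1.0, RAM_BW, VRAM_BW)
half_half = time_per_token(MODEL_GB, 0.5, RAM_BW, VRAM_BW)
print(ram_only / half_half)  # roughly 2x speedup with half the weights in VRAM
```

In practice VRAM is fast but not instantaneous, so the real speedup from a 50/50 split would be somewhat below 2x.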
It would be a helpful addition for testing whether 100GB models will load.