Is it possible to run the text-to-image example on cpu?

JonasFranke · July 20, 2024, 2:10pm

Hello! I'm trying to run the diffusers text-to-image training example on CPU instead of an NVIDIA GPU with CUDA, but I'm receiving the error below from xFormers, which is GPU-only.

The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_cpu_threads_per_process` was set to `6` to improve out-of-box performance when training on CPUs
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.3.1+cu121 with CUDA 1201 (you have 2.3.1+cpu)
    Python  3.12.4 (you have 3.12.4)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
07/20/2024 14:16:33 - INFO - __main__ - Distributed environment: DistributedType.NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cpu

Mixed precision type: no

{'thresholding', 'variance_type', 'rescale_betas_zero_snr', 'dynamic_thresholding_ratio', 'clip_sample_range'} was not found in config. Values will be initialized to default values.
{'latents_std', 'latents_mean', 'use_quant_conv', 'use_post_quant_conv', 'shift_factor'} was not found in config. Values will be initialized to default values.
Traceback (most recent call last):
  File "C:\Users\JonasFranke\Documents\Flaskyi\diffusers\examples\text_to_image\train_text_to_image.py", line 1142, in <module>
    main()
  File "C:\Users\JonasFranke\Documents\Flaskyi\diffusers\examples\text_to_image\train_text_to_image.py", line 645, in main
    unet.enable_xformers_memory_efficient_attention()
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\diffusers\models\modeling_utils.py", line 258, in enable_xformers_memory_efficient_attention
    self.set_use_memory_efficient_attention_xformers(True, attention_op)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\diffusers\models\modeling_utils.py", line 222, in set_use_memory_efficient_attention_xformers
    fn_recursive_set_mem_eff(module)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\diffusers\models\modeling_utils.py", line 218, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\diffusers\models\modeling_utils.py", line 218, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\diffusers\models\modeling_utils.py", line 218, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\diffusers\models\modeling_utils.py", line 215, in fn_recursive_set_mem_eff
    module.set_use_memory_efficient_attention_xformers(valid, attention_op)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\diffusers\models\modeling_utils.py", line 222, in set_use_memory_efficient_attention_xformers
    fn_recursive_set_mem_eff(module)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\diffusers\models\modeling_utils.py", line 218, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\diffusers\models\modeling_utils.py", line 218, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\diffusers\models\modeling_utils.py", line 215, in fn_recursive_set_mem_eff
    module.set_use_memory_efficient_attention_xformers(valid, attention_op)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\diffusers\models\attention_processor.py", line 310, in set_use_memory_efficient_attention_xformers
    raise ValueError(
ValueError: torch.cuda.is_available() should be True but is False. xformers' memory efficient attention is only available for GPU
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\accelerate\commands\accelerate_cli.py", line 48, in main
    args.func(args)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\accelerate\commands\launch.py", line 1097, in launch_command
    simple_launcher(args)
  File "C:\Users\JonasFranke\AppData\Local\Programs\Python\Python312\Lib\site-packages\accelerate\commands\launch.py", line 703, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\\Users\\JonasFranke\\AppData\\Local\\Programs\\Python\\Python312\\python.exe', 'train_text_to_image.py', '--pretrained_model_name_or_path=stabilityai/sdxl-turbo', '--dataset_name=flaskyi/flaskyi-v1-dataset', '--use_ema', '--resolution=256', '--center_crop', '--random_flip', '--train_batch_size=1', '--gradient_accumulation_steps=4', '--gradient_checkpointing', '--max_train_steps=15000', '--learning_rate=1e-05', '--max_grad_norm=1', '--enable_xformers_memory_efficient_attention', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--output_dir=sd-naruto-model', '--push_to_hub']' returned non-zero exit status 1.

Has anyone gotten it to work on CPU, and if so, how? Can anybody assist?

lgomezbachar · August 26, 2024, 1:57pm

I have the same question!!

nielsr · August 26, 2024, 2:00pm

Hi,

Could you share the code snippet? Models can definitely run on CPU. It seems you’re passing an xformers optimization, which can only run on a GPU.
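
To illustrate the point about the xformers optimization: the command in the traceback passes `--enable_xformers_memory_efficient_attention`, and the `ValueError` is raised when that option is enabled without CUDA. The simplest fix is probably to leave that flag out of the launch command on a CPU-only machine. Below is a minimal sketch of doing that programmatically. The helper names `cuda_available` and `strip_gpu_only_flags` are hypothetical and not part of diffusers or accelerate, and the argument list is a shortened copy of the one in the error output.

```python
# GPU-only flags seen in the command from the traceback above.
GPU_ONLY_FLAGS = {"--enable_xformers_memory_efficient_attention"}


def cuda_available() -> bool:
    """Return True only if PyTorch is installed and can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def strip_gpu_only_flags(args, has_cuda):
    """Hypothetical helper: drop GPU-only flags when no CUDA device exists."""
    if has_cuda:
        return list(args)
    return [a for a in args if a not in GPU_ONLY_FLAGS]


# Shortened copy of the arguments from the failing command.
train_args = [
    "--pretrained_model_name_or_path=stabilityai/sdxl-turbo",
    "--resolution=256",
    "--train_batch_size=1",
    "--gradient_checkpointing",
    "--enable_xformers_memory_efficient_attention",
]

# Simulate a CPU-only machine; the xformers flag is removed.
cpu_args = strip_gpu_only_flags(train_args, has_cuda=False)
print(cpu_args)
```

In a real launcher script you would pass `cuda_available()` as `has_cuda` instead of a hard-coded value, so the same script works on both CPU and GPU machines.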

rakusi · November 4, 2024, 11:10am

Running the model on CPU is too slow: it takes more than 2 hours to generate a single image. Is there a way to generate multiple images at once? I am using the following code with these parameter values:
{
    "prompt": "A capybara holding a sign that reads Hello World",
    "num_inference_steps": 28,
    "guidance_scale": 3.5
}

import torch
from diffusers import StableDiffusion3Pipeline

# model_name, cache_dir, and request are not defined in this snippet
pipe = StableDiffusion3Pipeline.from_pretrained(model_name, cache_dir=cache_dir, torch_dtype=torch.float32)
pipe = pipe.to("cpu")  # Move to CPU
image = pipe(
    request.prompt,
    num_inference_steps=request.num_inference_steps,
    guidance_scale=request.guidance_scale,
).images[0]
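
On generating multiple images at once: diffusers pipelines, including `StableDiffusion3Pipeline`, accept a `num_images_per_prompt` argument, so one call can return several images. Here is a rough sketch that only assembles the keyword arguments. `build_batch_kwargs` is a hypothetical helper, not a diffusers function, and the pipeline call itself is left as a comment because it needs a loaded model.

```python
def build_batch_kwargs(prompt, num_images, num_inference_steps=28, guidance_scale=3.5):
    """Hypothetical helper: assemble keyword arguments for one batched pipeline call."""
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    return {
        "prompt": prompt,
        "num_images_per_prompt": num_images,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
    }


# Same prompt and settings as in the post above, asking for two images.
kwargs = build_batch_kwargs(
    "A capybara holding a sign that reads Hello World",
    num_images=2,
)
print(kwargs)

# With a loaded pipeline (not created here), the call would look like:
# images = pipe(**kwargs).images  # a list with num_images entries
```

Note that batching mainly saves per-call overhead. On CPU the compute likely grows roughly in proportion to the number of images, so this may not reduce the time per image by much.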
