MPS is running slower than CPU on Mac M1 Pro

Hello everyone.

I have recently been testing the new version 0.3.0 on my M1 Pro. Following the steps from How to use Stable Diffusion in Apple Silicon (M1/M2), I found that the average execution times for CPU and MPS with similar prompts are:

  • MPS (GPU): 331 s
  • CPU: 222 s
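
For anyone who wants to reproduce the comparison, here is a minimal sketch of how averages like these can be measured. The helper name `average_seconds`, the warm-up pass, and the repeat count are illustrative choices, not something taken from the guide; `pipe` and `prompt` in the trailing comment stand for whatever pipeline and prompt you are using.

```python
import statistics
import time


def average_seconds(fn, repeats=3, warmup=1):
    """Return the mean wall-clock time of fn() over `repeats` timed runs.

    The first `warmup` calls are executed but not timed, so one-off
    setup costs do not skew the average.
    """
    for _ in range(warmup):
        fn()

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


# Hypothetical usage, assuming `pipe` is an already loaded pipeline:
#   pipe = pipe.to("mps")   # or "cpu"
#   print(average_seconds(lambda: pipe(prompt)))
```

Running the same helper once per device, with the same prompt and settings, keeps the two numbers comparable.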

Has anyone else tested it?

Hi @polodealvarado! Your CPU numbers are very similar to the ones I get on my M1 Max, but as reported in the page you mentioned, I see much faster speeds when using the GPU. Would you mind sharing a couple of details so I can take a look? These would be useful:

  • The amount of RAM your computer has.
  • The version of PyTorch you installed.
  • Your macOS version.
  • A small code snippet, only if you made any changes to the example we provided.
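
If it helps, most of these details can be collected with a short script. This is only a sketch: the function name `collect_environment` is made up for illustration, the RAM figure relies on `os.sysconf` names that may not be defined on every platform, and the `torch` and `diffusers` lookups are skipped if those packages are not installed.

```python
import os
import platform


def collect_environment():
    """Gather the basic details that are useful in an MPS performance report."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        # mac_ver() returns an empty version string on non-macOS systems.
        "macos": platform.mac_ver()[0] or None,
    }

    # Physical RAM in GiB; the sysconf names are an assumption about the OS.
    try:
        total = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        info["ram_gb"] = round(total / 1024**3, 1)
    except (AttributeError, ValueError, OSError):
        info["ram_gb"] = None

    try:
        import torch
        info["torch"] = torch.__version__
        mps = getattr(torch.backends, "mps", None)
        info["mps_available"] = bool(mps is not None and mps.is_available())
    except ImportError:
        info["torch"] = None
        info["mps_available"] = None

    try:
        import diffusers
        info["diffusers"] = diffusers.__version__
    except ImportError:
        info["diffusers"] = None

    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```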

Thanks a lot!

Hi @pcuenq! Thank you for answering.

Here are all the details, plus a few more:

  • RAM: 16 GB
  • GPU cores: 16
  • macOS version: 12.5.1
  • Python version: 3.9.13
  • Diffusers version: 0.3.0
  • Torch version: 1.13.0.dev20220908

I ran the example code unchanged. I also tried another Jupyter notebook from this repository, and the results are quite similar (CPU is faster than MPS).

@pcuenq I am following this thread about running on the mps backend.

That’s a very interesting thread! They specifically say that random operations are not yet optimized; however, the diffusers code generates the random latents on the CPU when using the mps device.
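
To illustrate what generating the latents on the CPU means, here is a rough sketch of the pattern. It is not the actual diffusers code: the function names `pick_device` and `make_latents`, the seed, and the `(1, 4, 64, 64)` shape are assumptions chosen for the example.

```python
import torch


def pick_device():
    """Return "mps" when the backend is present and usable, else "cpu"."""
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def make_latents(shape, seed, device):
    """Sample Gaussian noise on the CPU, then move it to the target device.

    The random op itself never runs on mps, so it is not affected by
    unoptimized random operations on that backend. Only the copy to the
    device happens there.
    """
    generator = torch.Generator(device="cpu").manual_seed(seed)
    latents = torch.randn(shape, generator=generator, device="cpu")
    return latents.to(device)


if __name__ == "__main__":
    device = pick_device()
    latents = make_latents((1, 4, 64, 64), seed=0, device=device)
    print(latents.device, tuple(latents.shape))
```

A side effect of sampling on the CPU is that the same seed gives the same starting noise whichever device the rest of the pipeline runs on.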

I’ll do some testing, thanks!
