Segfault during PyTorch + Transformers inference on Apple Silicon M4 (libomp.dylib crash on LayerNorm)

Hi all,

On Apple Silicon M4 (macOS 15.5, 24GB RAM, torch 2.7.1 + transformers 4.41.2), I’ve encountered a reproducible segmentation fault when running inference on the model dccuchile/bert-base-spanish-wwm-cased using CPU execution.

The model loads fine via from_pretrained(), but actual inference (model(input_ids, attention_mask)) triggers a crash. After tracing the issue via LLDB, I’ve confirmed the fault originates in libomp.dylib during thread suspension inside libtorch_cpu.dylib while executing LayerNorm.
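For anyone who wants to reproduce this without attaching LLDB first, Python's stdlib `faulthandler` module will dump the Python-level stack at the moment of the segfault, which at least confirms which call the crash happens under. A minimal sketch (the model name is from this report; the load/forward steps are left as comments since they are the part that crashes):

```python
# Sketch: dump the Python-level stack if the interpreter segfaults.
# faulthandler is stdlib, so it adds no dependencies to the repro.
import faulthandler
import sys

# Write any fault dump to stderr (or pass an open log file instead).
faulthandler.enable(file=sys.stderr)

# ...then run the usual load + forward pass, e.g.:
#   from transformers import AutoModel, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("dccuchile/bert-base-spanish-wwm-cased")
#   model = AutoModel.from_pretrained("dccuchile/bert-base-spanish-wwm-cased")
#   out = model(**tok("hola mundo", return_tensors="pt"))
```

If the process dies inside native code (libomp/libtorch), the dump shows the last Python frame that was active, which lines up with the LLDB backtrace.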

:white_check_mark: Reproducible on:

  • MacBook Pro with Apple M4, 24GB RAM, 1TB SSD
  • Python 3.11.13 (Homebrew)
  • torch 2.7.1 and transformers 4.41.2
  • CPU inference only (no GPU, no MPS)
  • dccuchile/bert-base-spanish-wwm-cased, but issue may generalize

:magnifying_glass_tilted_right: Not reproducible:

  • On Intel Macs
  • When only loading the model (no forward pass)

:paperclip: Attached ZIP package (via iCloud):

  • Scripts: IFA_app.py, repro_beto_loader.py
  • Full terminal logs (with and without crash)
  • pip freeze, system info
  • LLDB symbolic backtrace (lldb_backtrace_IFA_app.txt)
  • README (EN + ES)

:package: Download here:
(iCloud Drive - Apple iCloud)
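For anyone who can't grab the ZIP, the "pip freeze, system info" part is easy to regenerate locally; a stdlib-only sketch:

```python
# Sketch: collect the same environment details included in the ZIP
# (Python build, OS, machine architecture) using only the stdlib.
import platform
import sys

info = {
    "python": sys.version.split()[0],
    "os": platform.system(),        # e.g. "Darwin" on macOS
    "os_release": platform.release(),
    "machine": platform.machine(),  # "arm64" on Apple Silicon
}
for key, value in info.items():
    print(f"{key}: {value}")
```

`pip freeze > requirements.txt` covers the package side.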

Would love to know if others on M4 can replicate this, or if there’s known instability around OpenMP / LayerNorm on Apple Silicon CPUs.

Appreciate any insights!
Thanks,
Juan Alberto Ignacio Videla
Buenos Aires, Argentina


:counterclockwise_arrows_button: Update: This issue has now also been reported on GitHub for broader visibility and tracking:

:paperclip: GitHub Issue #39020 – Segfault on Apple M4 using AutoModelForSequenceClassification with BETO model on CPU

Includes full trace, LLDB backtrace, and Apple Feedback ID FB18354497.
Happy to collaborate with anyone experiencing similar behavior or investigating libomp / LayerNorm interactions on Apple Silicon.


I can’t open iCloud…:sweat_smile:

In any case, I don’t think the cause is Transformers, as it doesn’t usually produce a segfault on its own. More likely it’s PyTorch (especially version 2.3, which should be avoided in practice…) or the underlying environment. With Apple MPS, for example, there can be compatibility issues with IPython or Jupyter and the like.
https://stackoverflow.com/questions/71338821/segmentation-fault-python-after-import-torch-on-mac-m1

https://stackoverflow.com/questions/77812375/pytorch-error-on-mps-apple-silicon-metal

Hi John,
Many thanks for your reply!

You're right — Transformers itself is probably not the root cause. 

It’s unusual that this happens in CPU mode rather than MPS… It’s likely a problem in one of PyTorch’s libraries that sits close to the hardware. If your PyTorch build isn’t fairly recent, you may run into problems with libomp.
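One low-risk check while debugging: pin the OpenMP runtime to a single thread before launching the script. If the crash disappears, that points at libomp’s threading rather than the LayerNorm op itself. A sketch (the script name is the one from the ZIP above):

```shell
# Workaround sketch: disable OpenMP parallelism so libomp's worker
# threads are never created. Slower, but isolates the threading path.
export OMP_NUM_THREADS=1
# Then run the repro as usual, e.g.:
#   python repro_beto_loader.py
```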

Hi [John / Nikita / team],

Thank you for your responses and for following up. I confirm that I’m working on an environment running Apple Silicon M4, and I was using PyTorch 2.3.0 along with torchvision 0.18.1 — both installed via PyPI under the “stable” label.

Following your suggestions, I’ll be upgrading to PyTorch 2.7.1 and checking compatibility with the appropriate torchvision version (likely 0.18.3 or newer). I’ll also reinstall libomp to ensure there are no low-level conflicts related to CPU parallelism.
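Once the upgrade is in place, a quick way to confirm the fix is a CPU-only smoke test of the exact op that crashed; loading the full BETO model isn’t needed just to exercise LayerNorm. A minimal sketch, single-threaded to keep libomp’s workers out of the picture:

```python
# Smoke-test sketch: run LayerNorm on CPU, the op where the original
# segfault surfaced, with intra-op threading pinned to one thread.
import torch

torch.set_num_threads(1)  # keep libomp worker threads out of play

x = torch.randn(2, 8, 16)            # (batch, seq, hidden) toy tensor
layer_norm = torch.nn.LayerNorm(16)  # normalize over the hidden dim
y = layer_norm(x)

print(torch.__version__)
print(tuple(y.shape))  # (2, 8, 16)
```

If this passes with the default thread count as well (drop the `set_num_threads` line), the libomp path looks healthy.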

I’ll run tests over the next few days and share the results. Given that the M4 chip is relatively new, I understand these reports might be helpful for future compatibility validation.

Thanks again for the support — I’ll stay in touch with any updates.

Best regards, Juan Alberto Ignacio Videla

Buenos Aires - Argentina
