Backend low-level kernel libraries used in Transformers

Hello @ArthurZ @RaushanTurganbay,
What backend libraries are used for low-level kernel operations (matmul, softmax, etc.) in the Transformers library?
Issue: if I run the same model (say Mamba or Llama) on an x86 machine and an aarch64 machine, I observe a difference in model timing. I suspect the kernels take different code paths on x86 and aarch64.
Please specify the backend libraries used for x86 and aarch64.


In Transformers, attention defaults to torch.nn.functional.scaled_dot_product_attention (SDPA), so the backend libraries would be handled by PyTorch.
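A minimal sketch of the dispatch in question: calling PyTorch's SDPA directly lets PyTorch pick the fused kernel for the current platform, which is why the low-level library differs between builds. The tensor shapes here are arbitrary illustration values, not anything Transformers-specific.

```python
import torch
import torch.nn.functional as F

# (batch, heads, seq_len, head_dim) - arbitrary example shapes
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 8, 16, 64)
v = torch.randn(1, 8, 16, 64)

# PyTorch selects the SDPA backend (math, flash, mem-efficient, ...)
# appropriate for the device and build; this is where platform
# differences enter.
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 16, 64])
```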


@RaushanTurganbay,
Yes, PyTorch is evident. But we need to go one level further down, i.e. which libraries does torch call for these operations? Since we observe a timing difference between x86 and Arm, we can only find the bottleneck if we know the low-level kernel library.


I also faced this issue, but fortunately I found your thread. Thanks for the solution.
