Offloading LLM models to CPU uses only a single core

I’m also having the same problem with Mistral 7B. You could try BetterTransformer; a rough sketch of how I wrap the model is below.
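
This is only a minimal sketch, assuming the `optimum` package is installed and that your model architecture is supported by BetterTransformer; the model id and prompt are placeholders, not something from this thread.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.bettertransformer import BetterTransformer

# Placeholder model id -- swap in whatever checkpoint you are loading.
model_id = "mistralai/Mistral-7B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Loading without device_map keeps the whole model on CPU (fp32 by default).
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)

# Swap the attention/feed-forward modules for BetterTransformer kernels,
# assuming this architecture is covered by optimum's BetterTransformer.
model = BetterTransformer.transform(model)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
with torch.inference_mode():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```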