I was wondering how I can run inference on CPU. It doesn't seem possible with PyTorch, the pipelines API, or llama.cpp. From what I understand, the model uses a MixFormer architecture that isn't supported. Do you have any ideas?
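For reference, something along these lines is what I mean by trying to run it on CPU through the transformers pipelines API (the model id below is just a placeholder, and `trust_remote_code` is my assumption about what a custom architecture would need):

```python
import torch
from transformers import pipeline

# Minimal sketch of a CPU inference attempt via the pipelines API.
# "your-org/your-mixformer-model" is a placeholder, not a real model id.
generator = pipeline(
    "text-generation",
    model="your-org/your-mixformer-model",  # placeholder model id
    device=-1,                              # -1 forces CPU in the pipeline API
    torch_dtype=torch.float32,              # fp32 is the safe default on CPU
    trust_remote_code=True,                 # assumed necessary for a custom architecture
)

print(generator("Hello, world", max_new_tokens=32)[0]["generated_text"])
```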