Hi, I am wondering whether we can speed up the inference process by using multithreading/multiprocessing on CPU with Optimum's ONNX Runtime backend.
Hello @Talha,
By default, Optimum with ONNX Runtime will use all available cores. If you want to adjust this, e.g. to run multiprocessing and Optimum in parallel, you can limit the threads/CPU cores used by ONNX Runtime via onnxruntime.SessionOptions, configuring inter_op_num_threads
and intra_op_num_threads.