Hi, I am wondering whether we can speed up the inference process by using multithreading/multiprocessing on CPU with Optimum's ONNX Runtime backend.
Hello @Talha,
By default, Optimum with ONNX Runtime will use all available cores. If you want to adjust this, e.g. to run multiprocessing and Optimum in parallel, you can limit the threads/CPU cores used by ONNX Runtime via onnxruntime.SessionOptions, configuring inter_op_num_threads
and intra_op_num_threads.