Hello everyone,
I have been trying to optimize a BERT text-classification model for GPU using ORTOptimizer.
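For context, the call looks roughly like this minimal sketch (the checkpoint name and save directory are placeholders, not my actual values):

```python
from optimum.onnxruntime import ORTModelForSequenceClassification, ORTOptimizer
from optimum.onnxruntime.configuration import OptimizationConfig

# Placeholder checkpoint; in older optimum versions the export flag
# is from_transformers=True instead of export=True
model = ORTModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", export=True
)
optimizer = ORTOptimizer.from_pretrained(model)

# optimize_for_gpu=True makes the optimizer expect a working CUDA provider
optimization_config = OptimizationConfig(optimization_level=99, optimize_for_gpu=True)
optimizer.optimize(save_dir="optimized_model", optimization_config=optimization_config)
```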
When I ran it, I first got this warning message:
[W:onnxruntime:Default, onnxruntime_pybind_state.cc:578 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/reference/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
2022-11-22 13:46:00.800872459 [W:onnxruntime:, inference_session.cc:1458 Initialize] Serializing optimized model with Graph Optimization level greater than ORT_ENABLE_EXTENDED and the NchwcTransformer enabled. The generated model may contain hardware-specific optimizations, and should only be used in the same environment the model was optimized in.
Then I hit an AssertionError; the failing code in onnxruntime's optimizer is:
assert "CUDAExecutionProvider" in session.get_providers() # Make sure there is GPU 107 assert os.path.exists(optimized_model_path) and os.path.isfile(optimized_model_path) 108 logger.debug("Save optimized model by onnxruntime to {}".format(optimized_model_path))
Environment check:
get_available_providers()
output: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']
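As far as I understand, get_available_providers() only reports which providers the installed onnxruntime wheel was built with; the warning above suggests the CUDA provider fails when the session is actually created (e.g. missing or mismatched CUDA/cuDNN libraries), so the session silently falls back to CPU and the assertion fires. A minimal sketch to confirm this (the model path is a placeholder):

```python
import onnxruntime as ort

# Creating a session shows whether the CUDA EP can really load;
# if CUDA/cuDNN dependencies are missing, it falls back to CPU.
sess = ort.InferenceSession(
    "model.onnx",  # placeholder path to the exported ONNX model
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(sess.get_providers())  # expect "CUDAExecutionProvider" first
```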