OK got it - so in some cases the inference API may not be available, but that’s not something that I as an end user will have any control over.
In general though, it sounds like the intended workflow is to develop on the inference API where it’s available, and then graduate to the endpoint API when you need a production solution. Thanks again for the responses!