Performance of the hosted Inference API

For any one fine-tuned GPT-2 model, I would like to understand the performance of Hugging Face's hosted Inference API in 'Lab' mode and 'Startup' mode: specifically, the response time per request and the number of concurrent requests that can be served.
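
In case it helps frame the question, below is a minimal benchmark sketch for measuring both quantities empirically. It assumes the standard Inference API endpoint (`https://api-inference.huggingface.co/models/<model-id>`); the model id and token are placeholders, and the concurrency levels are just illustrative values, not plan limits:

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

# Placeholders: substitute your own fine-tuned model id and API token.
API_URL = "https://api-inference.huggingface.co/models/your-username/your-gpt2-model"
HEADERS = {"Authorization": "Bearer hf_xxx"}


def timed_request(prompt: str) -> float:
    """Send one inference request and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    response = requests.post(API_URL, headers=HEADERS, json={"inputs": prompt}, timeout=60)
    response.raise_for_status()
    return time.perf_counter() - start


def benchmark(concurrency: int, total_requests: int) -> None:
    """Fire identical prompts with `concurrency` workers and report latency stats."""
    prompts = ["Once upon a time"] * total_requests
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_request, prompts))
    print(
        f"concurrency={concurrency}: "
        f"min={latencies[0]:.2f}s "
        f"median={latencies[len(latencies) // 2]:.2f}s "
        f"max={latencies[-1]:.2f}s"
    )


if __name__ == "__main__":
    for concurrency in (1, 2, 4, 8):
        benchmark(concurrency, total_requests=16)
```

Running this under each plan would show how median latency degrades as concurrency grows, but I would still like to know the documented or expected limits for the two modes rather than relying only on my own measurements.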