Inference API budget, billing limit

Hi, it would be nice to be able to set a spending limit on the Inference API.

I like OpenAI's simple system: a soft limit and a hard limit.

It would be useful, especially when someone accidentally calls the API in an infinite loop :stuck_out_tongue:
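
To illustrate what I mean by soft/hard limits, here is a minimal client-side sketch (all names here are made up for the example: `SpendGuard`, `call_inference_api`, and the per-request cost are assumptions, not part of any real API). It warns once spending passes the soft limit and refuses further calls at the hard limit; ideally the same behaviour would live server-side in the billing settings.

```python
import warnings


class SpendGuard:
    """Client-side budget tracker with a soft (warn) and hard (block) limit, in USD."""

    def __init__(self, soft_limit: float, hard_limit: float):
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self.spent = 0.0

    def charge(self, cost: float) -> None:
        """Record the estimated cost of one request; warn or block when a limit is crossed."""
        if self.spent + cost > self.hard_limit:
            raise RuntimeError(
                f"Hard limit reached: {self.spent:.2f} + {cost:.2f} > {self.hard_limit:.2f} USD"
            )
        self.spent += cost
        if self.spent > self.soft_limit:
            warnings.warn(f"Soft limit exceeded: {self.spent:.2f} USD spent")


guard = SpendGuard(soft_limit=5.0, hard_limit=10.0)

try:
    for i in range(100_000):        # e.g. an accidental runaway loop
        guard.charge(cost=0.02)     # assumed per-request cost estimate
        # response = call_inference_api(...)   # hypothetical API call
except RuntimeError as err:
    print(f"Stopped after {i} calls: {err}")
```

Even this rough version stops a runaway loop after a bounded amount of spend, which is the behaviour I'd hope for from a built-in billing limit.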