Authorization header in inference API

Hi,
I am just wondering what is the purpose of the “Authorization” http header in the inference API. If I remove this header, the request is still working. Example:

curl 'https://api-inference.huggingface.co/models/bert-base-uncased' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: text/plain;charset=UTF-8' \
  -H 'Accept: */*' \
  -H 'Accept-Language: en,en-US;q=0.9,id;q=0.8,de;q=0.7,ms;q=0.6' \
  --data-binary '{"inputs":"If I am hungry, I will make [MASK]."}' \
  --compressed

@Narsil and @jeffboudier can chime in, but we offer a certain number of non-authed requests, with IP-based rate limiting.

For production workloads you’ll need a token.

Hi Cahya! The main purpose of the Authorization http header is to authenticate commercial customers of our Hosted Inference API subscriptions, for production workloads that require models to be preloaded / always available, to enable accelerated inference on CPU and/or GPU, and access to private models. If this service is of interest for your organization, please reach out!

Thanks @jeffboudier and @julien-c for the explanation.