Help using inference endpoint with Llama 3.1 405B Instruct

Trying to run an example from: https://huggingface.co/blog/llama31#inference-integrations

It works on the smaller models, but for 405B the client just freezes (I waited about 30 minutes).
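For reference, my call is essentially the snippet from that blog post, adapted roughly like this (a minimal sketch, assuming a valid PRO token; the token value and the 60 s timeout are placeholders I chose, and the model ID is the one from my traceback):

```python
# Sketch of the chat-completion call via huggingface_hub's InferenceClient.
# Token value and timeout are placeholders, not from the blog post.
from huggingface_hub import InferenceClient

MODEL_ID = "meta-llama/Meta-Llama-3.1-405B-Instruct-FP8"

def ask(prompt: str, token: str) -> str:
    # An explicit timeout at least stops the client from hanging forever
    # when the backend is overloaded.
    client = InferenceClient(model=MODEL_ID, token=token, timeout=60)
    out = client.chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=64,
    )
    return out.choices[0].message.content

if __name__ == "__main__":
    print(ask("Hello!", token="hf_your_token_here"))
```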

Has anyone successfully queried this model? (I have a PRO account on HF.)

Update: now I get this:

Exception has occurred: HfHubHTTPError

429 Client Error: Too Many Requests for url: (link). Rate limit reached. You reached PRO hourly usage limit. Use Inference Endpoints (dedicated) to scale your endpoint.

requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: api-inference.huggingface.co/models/meta-llama/Meta-Llama-3.1-405B-Instruct-FP8/v1/chat/completions
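For what it's worth, I've been wrapping the call in a generic exponential-backoff retry so transient 429/503 responses don't kill the script. This helper is my own sketch, not part of the `huggingface_hub` API, and it won't get around the hourly PRO quota itself (that needs much longer waits or a dedicated endpoint):

```python
import time

def with_retries(fn, max_tries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on exception, wait base_delay * 2**attempt and retry.

    Re-raises the last exception after max_tries failed attempts.
    """
    for attempt in range(max_tries):
        try:
            return fn()
        except Exception:
            if attempt == max_tries - 1:
                raise
            sleep(base_delay * 2 ** attempt)

# Example with a stand-in function that fails twice, then succeeds:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

result = with_retries(flaky, base_delay=0.01)
print(result)  # -> ok (after 3 attempts)
```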

I have been trying to query the model, but I keep getting the same error:
Error code: 503 - {'error': 'Service Unavailable'}. The same code works fine for Llama 3.1 8B and 70B. I have access to the models and a PRO account on HF.