Phi-3-mini-128k-instruct not working with the Pro Inference API

Using this endpoint:

I get this error:

Error code: 500 - {'error': 'The repository for microsoft/Phi-3-mini-128k-instruct contains custom code which must be executed to correctly load the model. You can inspect the repository content at\nPlease pass the argument trust_remote_code=True to allow custom code to be run.'}

client.list_deployed_models() shows that the model is deployed. The 4k version works fine.


Hi @sam-paech,
Looking at the models available on TGI, currently only microsoft/phi-3-mini-4k-instruct is supported. I can check whether we have plans to support the other models.

Running this code:

from huggingface_hub import InferenceClient
client = InferenceClient(token=MYTOKEN)
deployed = client.list_deployed_models()

Returns this list for deployed text-generation models, which includes the 128k model:

'text-generation': ['b3ck1/gpt-neo-125M-finetuned-beer-recipes',
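To spot-check one model in that mapping rather than eyeballing the full list, you can filter the dictionary that list_deployed_models() returns. A minimal sketch (the sample dictionary below is illustrative, and the live call is commented out since it needs a valid token):

```python
def is_deployed(models_by_task: dict, task: str, model_id: str) -> bool:
    """Return True if model_id appears under the given task in the
    mapping returned by InferenceClient.list_deployed_models()."""
    return model_id in models_by_task.get(task, [])

# Illustrative sample mimicking the shape of list_deployed_models() output.
sample = {
    "text-generation": [
        "b3ck1/gpt-neo-125M-finetuned-beer-recipes",
        "microsoft/Phi-3-mini-128k-instruct",
    ]
}

print(is_deployed(sample, "text-generation", "microsoft/Phi-3-mini-128k-instruct"))  # True

# Against the live API (requires a valid token):
# from huggingface_hub import InferenceClient
# client = InferenceClient(token=MYTOKEN)
# print(is_deployed(client.list_deployed_models(), "text-generation",
#                   "microsoft/Phi-3-mini-128k-instruct"))
```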

Yes, it's available with Transformers; however, the inference stack still doesn't support custom remote code.

And here is the list of all models that are "warm":

If you need to deploy Phi-3-mini-128k-instruct as an Inference Endpoint, you'll need a custom handler to support trust_remote_code=True.
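For reference, a custom handler on Inference Endpoints is a handler.py file at the root of a model repository that defines an EndpointHandler class. A minimal sketch for a repo that needs trust_remote_code=True (the transformers import is deferred so the file can be parsed without the library installed; the input/parameter handling is illustrative, not a fixed contract):

```python
from typing import Any, Dict, List


class EndpointHandler:
    def __init__(self, path: str = ""):
        # Deferred import: transformers is only needed when the endpoint boots.
        from transformers import pipeline

        # trust_remote_code=True allows the repo's custom modeling code to run.
        self.pipe = pipeline(
            "text-generation",
            model=path,
            trust_remote_code=True,
        )

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, str]]:
        # Inference Endpoints passes the request payload as a dict.
        inputs = data.get("inputs", "")
        params = data.get("parameters", {}) or {}
        outputs = self.pipe(inputs, **params)
        return [{"generated_text": out["generated_text"]} for out in outputs]
```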

This list doesn't contain some models that are listed as deployed and are working with the API, e.g. command-r-plus.

I only just signed up for the Pro subscription, so I'm not sure how it's supposed to work. But I would have assumed the list returned by client.list_deployed_models() would represent the models that are available for inference with the API. If that isn't the case, is there a way to get an authoritative list of deployed and working models?
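(Recent versions of huggingface_hub also expose InferenceClient.get_model_status(), which reports whether one specific model is loadable on the serverless API; that can serve as a per-model cross-check of the list. A sketch, with the live call commented out since it requires a token, and with a hand-rolled summarizing helper that is not a library API:)

```python
# Live usage (requires a valid token and a recent huggingface_hub):
# from huggingface_hub import InferenceClient
# client = InferenceClient(token=MYTOKEN)
# status = client.get_model_status("microsoft/Phi-3-mini-128k-instruct")
# print(status.loaded, status.state)


def summarize_status(loaded: bool, state: str) -> str:
    """Condense a ModelStatus-like (loaded, state) pair into a short verdict."""
    return f"{'usable' if loaded else 'not loaded'} ({state})"


print(summarize_status(True, "Loadable"))  # usable (Loadable)
```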

Oops, I was wrong: your list does contain command-r-plus.

Thanks for the info! I was also trying to deploy Phi-3 models on a dedicated endpoint, and a custom handler seems to be the only current solution.

Is there a similar list of models currently supported on dedicated Inference Endpoints (without requiring a custom handler)?
Trying to deploy microsoft/phi-3-mini-4k-instruct on one gives me a similar error about trust_remote_code in the logs.

Also, according to the model page:

Phi-3 has been integrated in the development version of transformers.

Is it then reasonable to assume that once dedicated Inference Endpoints use a Transformers version > 4.40.0, microsoft/phi-3-mini-4k-instruct will be deployable on them through TGI? Is it possible to see which version is currently used?


Yes, the list of default libraries can be found here: (it includes everything except the TGI version; the team is going to fix that).

Phi-3 has indeed been integrated natively into the Transformers library: transformers/src/transformers/models/phi3 at main · huggingface/transformers · GitHub, which means you can now load it without having to specify trust_remote_code=True.
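So whether Phi-3 loads without trust_remote_code effectively comes down to the installed Transformers version (4.40.0 or newer, per the thread). A small version-gate sketch, with the actual model load commented out since it downloads weights (the version-parsing helper is a hand-rolled illustration, not a library API):

```python
def at_least(installed: str, required: str = "4.40.0") -> bool:
    """Compare dotted version strings numerically, ignoring dev/rc suffixes."""
    def parse(v: str):
        nums = []
        for part in v.split("."):
            digits = "".join(ch for ch in part if ch.isdigit())
            if not digits:
                break
            nums.append(int(digits))
        return tuple(nums)
    return parse(installed) >= parse(required)


print(at_least("4.38.2"))  # False: the version cited later in the thread
print(at_least("4.40.0"))  # True: native Phi-3 support

# With a new enough Transformers, the model loads without custom code:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
# model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
```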


Hi, @nielsr,

I'm a bit confused by your answer. I tried to create a dedicated Inference Endpoint with microsoft/Phi-3-mini-4k-instruct, but it failed with an error saying I need to specify trust_remote_code=True.


That's probably because the current Transformers version on Inference Endpoints is 4.38.2, as per the doc here. Hence it will only be possible once this is updated to Transformers v4.40.
