I’m trying to use a custom container for inference, but I can’t find anywhere to set the args for the command that starts the container’s serving process. Anyone know what I’m missing?
For example, I need to set things like this:
```python
f"--model={model_id}",
f"--tensor-parallel-size={accelerator_count}",
"--swap-space=16",
f"--dtype={dtype}",
"--gpu-memory-utilization=0.9",
```
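For context, here is roughly how I’m trying to wire it up. This is only a sketch: I’m assuming the Vertex AI SDK’s `aiplatform.Model.upload` with its `serving_container_args` parameter is the right place for these, and all the concrete values (`model_id`, image URI, etc.) are placeholders, not my real config:

```python
# Placeholder values -- substitute your own model and hardware settings.
model_id = "meta-llama/Llama-2-7b-hf"
accelerator_count = 1
dtype = "bfloat16"

# These are standard vLLM server flags; the list itself is what I want
# the platform to pass to the container's entrypoint at startup.
serving_args = [
    f"--model={model_id}",
    f"--tensor-parallel-size={accelerator_count}",
    "--swap-space=16",
    f"--dtype={dtype}",
    "--gpu-memory-utilization=0.9",
]

# ASSUMPTION: serving_container_args is where these should go when
# uploading the model (commented out since it needs GCP credentials):
# from google.cloud import aiplatform
# model = aiplatform.Model.upload(
#     display_name="vllm-custom",
#     serving_container_image_uri="<my-custom-image-uri>",
#     serving_container_args=serving_args,
# )
```

If there is a different mechanism for injecting these at container start (env vars, a command override, etc.), that is exactly what I am looking for.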