Cohere Command-r-plus-(4bit) not deployable

Hi :wave:

I am currently unable to deploy CohereForAI/c4ai-command-r-plus as an Inference Endpoint. It fails with the following error:

```
Server message: Endpoint failed to start.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1128, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 825, in __getitem__
    raise KeyError(key)
KeyError: 'cohere'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 732, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 608, in __aenter__
    await self._router.startup()
  File "/usr/local/lib/python3.10/dist-packages/starlette/routing.py", line 709, in startup
    await handler()
  File "/app/webservice_starlette.py", line 60, in some_startup_task
    inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
  File "/app/huggingface_inference_toolkit/handler.py", line 54, in get_inference_handler_either_custom_or_default_handler
    return HuggingFaceHandler(model_dir=model_dir, task=task)
  File "/app/huggingface_inference_toolkit/handler.py", line 18, in __init__
    self.pipeline = get_pipeline(
  File "/app/huggingface_inference_toolkit/utils.py", line 276, in get_pipeline
    hf_******** = pipeline(
  File "/usr/local/lib/python3.10/dist-packages/transformers/pipelines/__init__.py", line 815, in pipeline
    config = AutoConfig.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1130, in from_pretrained
    raise ValueError(
ValueError: The checkpoint you are trying to load has model type `cohere` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
Application startup failed. Exiting.
```

It looks like the Inference Endpoints runtime is not up to date: the `KeyError: 'cohere'` suggests the installed Transformers version predates support for the `cohere` model type.
I also posted this issue here because I was not sure where to report it.
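For what it's worth, a quick sanity check that reproduces the root cause (run in the endpoint's container or locally): if `cohere` is missing from the auto-config registry, the pipeline raises exactly the `KeyError`/`ValueError` above. My understanding is that the `cohere` architecture landed in Transformers v4.39.0, but please verify against the release notes:

```python
# Check whether the installed transformers build recognizes the `cohere` model type.
# Assumption (worth verifying): support was added in transformers v4.39.0.
import transformers
from transformers.models.auto.configuration_auto import CONFIG_MAPPING

supported = "cohere" in CONFIG_MAPPING
print(f"transformers {transformers.__version__}: cohere supported = {supported}")
# `supported == False` reproduces the KeyError('cohere') from the traceback.
```

If this prints `False`, the runtime image simply needs a newer Transformers release.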

Best,
Hagen