TGI with Qwen 2.5 Coder 7B base

Looking at the list of supported models for TGI, I do not see Qwen 2.5 listed.

Could I host an inference server for a fine-tuned model whose base is Qwen 2.5 Coder 7B in BF16 using the TGI offer?


I think it’s probably just that the documentation hasn’t been updated. 😅 Qwen 2.5 works with the Serverless Inference API and is one of the most popular models there.
On the other hand, it is currently almost impossible to run models you have fine-tuned yourself with the Serverless Inference API; I suspect the HF servers don’t have enough resources for that.
Of course, there is no problem running it locally, or using it from Spaces or another virtual environment after uploading it to HF.

Yes, Qwen 2.5 is supported, since its architecture is identical to Qwen 2.

This is similar to the Llama models: the docs only list Llama, but TGI supports any Llama version.
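As a concrete sketch, serving a fine-tuned Qwen 2.5 Coder 7B model in BF16 with TGI should work like any other supported architecture. The model ID below is a placeholder for your own fine-tune on the Hub; adjust the GPU flags and TGI image tag to your setup.

```shell
# Launch a TGI container serving a hypothetical fine-tuned Qwen 2.5 Coder 7B repo in BF16.
# "your-username/qwen2.5-coder-7b-finetune" is a placeholder, not a real model ID.
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v "$PWD/data:/data" \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id your-username/qwen2.5-coder-7b-finetune \
  --dtype bfloat16
```

Once the server is up, you can query it at `http://localhost:8080/generate` with the usual TGI JSON payload.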
