Want to host a production-level server for running an LLM for code generation

I am planning to host an LLM on a GPU VPS (NVIDIA L40S) at my software development company for code generation, and the model I have in mind is Qwen2.5-Coder-32B-Instruct. I looked at a few serving libraries, such as TGI with its TensorRT-LLM (trtllm) backend, and would love a guide on setting up the model for the TGI trtllm backend, as I am new to the AI field. I tried running the model with plain TGI but ran into a few issues. Also, if there are any better models for code generation, please feel free to suggest them.
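To be concrete about what I'm aiming for: once the server is up, I'd like to be able to run a simple smoke test against it like the one below. This is just a minimal sketch assuming TGI's native `/generate` endpoint and a placeholder host/port (adjust to wherever the container is exposed):

```python
import requests

# Placeholder address: change to wherever your TGI container is exposed.
TGI_URL = "http://localhost:8080/generate"

payload = {
    "inputs": "Write a Python function that reverses a string.",
    "parameters": {
        "max_new_tokens": 256,  # cap generation length
        "temperature": 0.2,     # low temperature for more deterministic code
    },
}

resp = requests.post(TGI_URL, json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["generated_text"])
```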

I also want the LLM to be usable from the Continue.dev extension in VS Code.
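My understanding is that Continue.dev can point at any OpenAI-compatible endpoint, and TGI exposes one at `/v1/chat/completions` (the Messages API), so I'd expect Continue to talk to the same server I'd verify like this. Again a minimal sketch with a placeholder base URL, not a confirmed setup:

```python
from openai import OpenAI

# TGI does not require an API key by default, so a dummy value is fine.
# Base URL is a placeholder: point it at your server's /v1 route.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="-")

resp = client.chat.completions.create(
    model="tgi",  # TGI serves one model; a placeholder name is accepted here
    messages=[{"role": "user", "content": "Write FizzBuzz in Python."}],
    max_tokens=200,
)
print(resp.choices[0].message.content)
```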

Thanks.