Yeah, yesterday I was working with Ollama, and Ollama runs a **single inference session per model instance.** When multiple requests hit that same instance, Ollama queues them and processes them one at a time — there’s no parallel token generation inside one model. So that’s the drawback of it. That’s why I was thinking of running the model locally with libraries like Transformers, vLLM, or Text Generation Inference (TGI) instead; see the sketch below.
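For example, here’s a minimal vLLM sketch of batched generation (assuming vLLM is installed and the model fits in local VRAM; the model name is just a placeholder, not one mentioned in this thread):

```python
# Minimal sketch: vLLM batches these prompts and generates tokens for them
# concurrently (continuous batching), instead of queuing one request at a time.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # placeholder model
params = SamplingParams(temperature=0.7, max_tokens=256)

prompts = [
    "Summarize what continuous batching does.",
    "Explain paged attention in one paragraph.",
]
outputs = llm.generate(prompts, params)
for out in outputs:
    print(out.outputs[0].text)
```

TGI exposes a similar capability as an HTTP server rather than a Python API, so the right choice mostly depends on whether you want in-process generation or a standalone serving endpoint.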