I have fine-tuned an LLM on my custom data using PEFT and LoRA via the AutoTrain Advanced UI.
How do I create an inference endpoint for it after fine-tuning?
If anyone has done this before or has any idea how, please share.
+1 for that. text-generation-inference supports loading a PeftModel directly with the adapter, without needing to merge the weights first. I'd expect Inference Endpoints to support this option as well, but I couldn't find a way to do it. This is especially important for quantized models, since LoRA weights can't be merged into a quantized base model.
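For context on why merging fails for quantized models: "merging" a LoRA adapter means folding the low-rank update into the base weight matrix (for unquantized models, peft exposes this as `merge_and_unload()` on a `PeftModel`). A minimal numeric sketch of that operation, in plain Python with toy numbers (all values here are hypothetical, assuming the standard LoRA update rule W' = W + (alpha/r) * B @ A):

```python
# Sketch of what "merging LoRA weights into the base model" means:
# W_merged = W_base + (alpha / r) * B @ A.
# With a quantized model, W_base is stored in low precision, so this
# addition cannot be applied losslessly -- hence the need for serving
# that loads the adapter alongside the base instead of merging.

def matmul(a, b):
    """Naive matrix multiply for small toy matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def merge_lora(w_base, lora_a, lora_b, alpha, r):
    """Fold the rank-r LoRA update (B @ A) into the base weights."""
    delta = matmul(lora_b, lora_a)  # B @ A has the same shape as W_base
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(w_base, delta)]

# Toy 2x2 base weight and a rank-1 adapter (hypothetical numbers)
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]           # shape r x d_in
B = [[0.5], [0.25]]        # shape d_out x r
merged = merge_lora(W, A, B, alpha=2, r=1)
print(merged)  # [[2.0, 2.0], [0.5, 2.0]]
```

In full precision this addition is exact, which is why `merge_and_unload()` works for fp16/bf16 models; with a 4-bit or 8-bit base there is no exact representation of the merged weights, so the adapter has to stay separate at serving time.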