PEFT + Inference

Is there a way to get PEFT to work with inference endpoints?

Ideally, we should be able to support multiple PEFT models with a common inference endpoint for the base model.


Any updates here?

You could configure a custom handler, which lets you specify the code that loads the model and its adapters. See the guide: Create custom Inference Handler
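As a rough sketch of what such a handler might look like (not an official recipe): a `handler.py` that loads one base model, attaches several PEFT adapters to it, and picks one per request. The model and adapter IDs below are placeholders, and the `adapter` request field is a made-up convention for this example, not part of any official payload schema. It assumes a causal language model and that `peft` and `transformers` are listed in the repository's `requirements.txt`.

```python
from typing import Any, Dict, List, Tuple

# Placeholder IDs (assumptions): replace with your own base model and adapter repos.
BASE_MODEL_ID = "your-org/base-model"
ADAPTERS = {
    "adapter_a": "your-org/peft-adapter-a",
    "adapter_b": "your-org/peft-adapter-b",
}
DEFAULT_ADAPTER = next(iter(ADAPTERS))


def parse_request(data: Dict[str, Any]) -> Tuple[str, str, Dict[str, Any]]:
    """Split a request payload into prompt, adapter name, and generation kwargs."""
    prompt = data["inputs"]
    params = dict(data.get("parameters") or {})
    # "adapter" is a hypothetical field chosen for this sketch.
    adapter = params.pop("adapter", DEFAULT_ADAPTER)
    return prompt, adapter, params


class EndpointHandler:
    def __init__(self, path: str = ""):
        # Imported here so the module itself stays importable without the heavy dependencies.
        from peft import PeftModel
        from transformers import AutoModelForCausalLM, AutoTokenizer

        self.tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
        base = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)

        # Wrap the base model with the first adapter, then attach the others by name.
        self.model = PeftModel.from_pretrained(
            base, ADAPTERS[DEFAULT_ADAPTER], adapter_name=DEFAULT_ADAPTER
        )
        for name, repo in ADAPTERS.items():
            if name != DEFAULT_ADAPTER:
                self.model.load_adapter(repo, adapter_name=name)
        self.model.eval()

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, str]]:
        prompt, adapter, params = parse_request(data)
        if adapter not in ADAPTERS:
            raise ValueError(f"Unknown adapter: {adapter}")
        # Activate the requested adapter on the shared base model.
        self.model.set_adapter(adapter)
        inputs = self.tokenizer(prompt, return_tensors="pt")
        output_ids = self.model.generate(**inputs, **params)
        text = self.tokenizer.decode(output_ids[0], skip_special_tokens=True)
        return [{"generated_text": text}]
```

A request would then look something like `{"inputs": "Hello", "parameters": {"adapter": "adapter_b", "max_new_tokens": 32}}`. Note that `set_adapter` changes state on the shared model, so this sketch assumes requests are handled one at a time.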