Hello, have you found any guides that were useful to you? We need to write a handler.py file for the Mistral-7B model that we fine-tuned using Unsloth, so we can deploy it on an Inference Endpoint.
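For context, a custom handler.py for Hugging Face Inference Endpoints generally follows the `EndpointHandler` convention: a class with an `__init__(self, path)` that loads the model from the repo contents and a `__call__(self, data)` that handles requests. A minimal sketch (assuming a text-generation model; the pipeline task and parameter names are illustrative, not specific to this thread):

```python
# Sketch of a custom handler.py for a Hugging Face Inference Endpoint.
# Assumes the repo contains a full (merged) model, not just adapter weights.
from typing import Any

from transformers import pipeline


class EndpointHandler:
    def __init__(self, path: str = ""):
        # `path` points to the deployed repository contents on the endpoint
        self.pipe = pipeline("text-generation", model=path)

    def __call__(self, data: dict[str, Any]) -> list[dict[str, Any]]:
        # Inference Endpoints send payloads like:
        #   {"inputs": "...", "parameters": {...}}
        inputs = data.get("inputs", "")
        parameters = data.get("parameters", {})
        return self.pipe(inputs, **parameters)
```

The file goes in the root of the model repository; the endpoint picks it up automatically when a handler.py is present.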
Hi @DisgustingOzil, we had the error because of an incorrect procedure: instead of merging the adapter weights with the base model, we uploaded the adapter weights directly and tried to deploy those instead.
After reading the following thread: Config.json is not saving after finetuning Llama 2 - #5 by hemanthkumar23, we used the merge_and_unload() function and were able to upload the complete model, which was deployable. Unfortunately, I don’t know how to reduce the size of the merged model. Hope this helps!
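For anyone following along, the merge step looks roughly like this with PEFT. A sketch only (the base model and adapter repo names are placeholders, not the ones from this thread):

```python
# Sketch: merge LoRA adapter weights into the base model so the uploaded
# repo contains a full, deployable model rather than adapter weights alone.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder names -- substitute your own base model and adapter repo.
base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model = PeftModel.from_pretrained(base, "your-username/your-lora-adapter")

# merge_and_unload() folds the adapter weights into the base weights and
# returns a plain transformers model with no PEFT wrapper.
merged = model.merge_and_unload()

merged.save_pretrained("merged-model")
AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1").save_pretrained("merged-model")
# merged.push_to_hub("your-username/your-merged-model")  # upload the full model
```

Note the merged checkpoint is the full size of the base model (~14 GB in fp16 for a 7B model), which is why the upload is so much larger than the adapter alone.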