Hi, I created a custom handler that loads one of the Llama 2 models, combines it with PEFT weights, and was hoping I could deploy the resulting model to an inference endpoint. However, I ran into permission issues, even though I have access to Llama 2 (which is gated). I'm wondering if I'm missing something, and would appreciate any help with this. I browsed through the docs but couldn't find anything specific to this scenario. Thanks in advance!
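For reference, the handler is roughly shaped like the sketch below (simplified: the base model ID, the `HF_TOKEN` env-var name, and the generation settings are placeholders, not my exact code). The relevant detail for the permission issue is that every `from_pretrained` call against a gated repo needs the access token passed explicitly:

```python
import os


class EndpointHandler:
    # Rough sketch of the custom handler described above. The base model ID,
    # the HF_TOKEN env var, and the generation settings are assumptions.
    BASE_MODEL = "meta-llama/Llama-2-7b-hf"

    def __init__(self, path: str = ""):
        # Heavy imports deferred so the file can be read without
        # transformers/peft installed.
        import torch
        from peft import PeftModel
        from transformers import AutoModelForCausalLM, AutoTokenizer

        # Gated repos (like Llama 2) reject anonymous downloads with a
        # 401/403, so the token must reach every from_pretrained call.
        token = os.environ.get("HF_TOKEN")
        base = AutoModelForCausalLM.from_pretrained(
            self.BASE_MODEL, token=token, torch_dtype=torch.float16
        )
        # Attach the PEFT (LoRA) adapter weights stored alongside the handler.
        self.model = PeftModel.from_pretrained(base, path, token=token)
        self.tokenizer = AutoTokenizer.from_pretrained(self.BASE_MODEL, token=token)

    def __call__(self, data: dict) -> list:
        inputs = self.tokenizer(data["inputs"], return_tensors="pt")
        output_ids = self.model.generate(**inputs, max_new_tokens=64)
        text = self.tokenizer.decode(output_ids[0], skip_special_tokens=True)
        return [{"generated_text": text}]
```

(Older transformers versions use `use_auth_token=` instead of `token=`; I'm on a recent release.)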
First question: did you request access to the Llama 2 model from Meta? And if so, did you do it with the same email as your Hugging Face account? Once Meta grants access, it should let you pass the gate within HF.
Beyond that, can you share any of the code and/or errors you're getting? Are you trying to deploy a Hugging Face Inference Endpoint, a SageMaker endpoint, or something else?