Inference Issue with Llama Models using HF Inference

When loading a fine-tuned model that is published as delta (adapter) weights, an error can occur because the library also has to fetch the original base model, and Llama base models on the Hugging Face Hub are gated. In short, the fix is to authenticate with a Hugging Face access token. There are several ways to do this, such as passing it directly with token= to from_pretrained(), or calling login() from huggingface_hub in advance.
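A minimal sketch of the two workarounds. The model name "meta-llama/Llama-2-7b-hf" is illustrative, and the token is assumed to be stored in the HF_TOKEN environment variable; substitute your own gated base model and token source.

```python
# Sketch: authenticating so a gated Llama base model can be downloaded.
# Assumptions: BASE_MODEL is a hypothetical gated repo you have access to,
# and HF_TOKEN holds a valid Hugging Face access token.
import os

from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "meta-llama/Llama-2-7b-hf"  # illustrative gated base model


def load_with_token_kwarg():
    """Option 1: pass the token directly to from_pretrained()."""
    token = os.environ["HF_TOKEN"]
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, token=token)
    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, token=token)
    return tokenizer, model


def load_after_login():
    """Option 2: authenticate once up front, then load normally."""
    from huggingface_hub import login

    login(token=os.environ["HF_TOKEN"])  # caches the token for later calls
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    return tokenizer, model
```

Either approach resolves the authentication error; login() is convenient when several downloads follow, while token= keeps the credential scoped to a single call.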