This worked for me:
from ctransformers import AutoModelForCausalLM  # assuming the ctransformers package

llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/zephyr-7B-beta-GGUF",
    model_file="zephyr-7b-beta.Q5_K_M.gguf",
    model_type="mistral",
    gpu_layers=50,
    max_new_tokens=1000,
    context_length=6000,
)
No warnings in the output.
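As a rough sanity check on max_new_tokens and context_length, here is a minimal sketch. The helper name prompt_token_budget is made up for illustration and is not part of any library. It assumes the prompt and the generated tokens share one context window, so the room left for the prompt is roughly context_length minus max_new_tokens.

```python
def prompt_token_budget(context_length: int, max_new_tokens: int) -> int:
    """Return how many prompt tokens fit if generation uses its full allowance.

    Hypothetical helper for illustration only; assumes the prompt and the
    generated tokens share a single context window.
    """
    if max_new_tokens >= context_length:
        raise ValueError("max_new_tokens must be smaller than context_length")
    return context_length - max_new_tokens


# Values taken from the from_pretrained call above.
print(prompt_token_budget(context_length=6000, max_new_tokens=1000))  # 5000
```

With the settings above that leaves about 5000 tokens for the prompt. If your prompts are longer than that, it is probably worth raising context_length or lowering max_new_tokens.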