How to use an LLM (access failure)

Regarding your second question:
Pass the path to your local model directory as the model argument when creating the pipeline, e.g. model='./path_for_local_model':

import torch
import transformers

# Point `model` at the local directory containing the model weights and config.
pipeline = transformers.pipeline(
    "text-generation",
    model='./path_for_local_model',
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)