About your second question:
You need to call the pipeline with model='./path_for_local_model':
import torch
import transformers

# Pass the local model directory instead of a Hub model id.
pipeline = transformers.pipeline(
    "text-generation",
    model='./path_for_local_model',
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)
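If the call fails, it may be worth checking that the path really points at a saved model directory before handing it to the pipeline. As far as I know, a string that is not an existing local directory gets treated as a Hub model id, which produces a confusing error. Below is a minimal sketch using only the standard library; the helper name `looks_like_local_model` is my own (hypothetical), and it only assumes that the directory contains a `config.json`, which is what `save_pretrained` writes.

```python
from pathlib import Path


def looks_like_local_model(path: str) -> bool:
    """Return True if `path` is a directory that contains a config.json file.

    Hypothetical helper: this is only a rough sanity check. It does not
    verify that weights or tokenizer files are present.
    """
    model_dir = Path(path)
    return model_dir.is_dir() and (model_dir / "config.json").is_file()
```

Calling `looks_like_local_model('./path_for_local_model')` before building the pipeline should return True for a directory you saved yourself. If it returns False, the path is probably wrong or relative to a different working directory.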