How to avoid 'Loading checkpoint shards'?

Hello, I have downloaded the model to my local computer in the hope that it would help me avoid the dreadfully slow loading process. Sadly, it didn't work as intended with the demo code. Is that possible, and if so, how can I adapt the code to do it?

from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch

tokenizer = T5Tokenizer.from_pretrained("LOCAL_PATH")
model = T5ForConditionalGeneration.from_pretrained("LOCAL_PATH", device_map="auto")

input_text = "INPUT"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))