I'm trying to create a basic inference endpoint on SageMaker, but I can see that the model downloads to 30-40% and then the download restarts, and this loop keeps repeating. After 20-30 minutes the deployment just fails.
I tried different instance types as well, but the same issue persists.
Any help would be really appreciated.
Here is the exact code:
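import sagemaker
from sagemaker.huggingface import HuggingFaceModel

# IAM role with SageMaker permissions (assumes this runs inside a SageMaker notebook)
role = sagemaker.get_execution_role()

# Hub model configuration, passed to the container as environment variables
hub = {
    'HF_MODEL_ID': 'EleutherAI/gpt-j-6B',
    'HF_TASK': 'text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.6.1',
    pytorch_version='1.7.1',
    py_version='py36',
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,  # number of instances
    instance_type='ml.m5.4xlarge'  # ec2 instance type
)

# send a test request to the endpoint
predictor.predict({
    'inputs': "Can you please let us know more details about your "
})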
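
For context on how much data the endpoint has to pull at startup, here is a rough size estimate. It is only a sketch: the parameter count (about 6 billion, taken from the model name) and 32-bit float storage are my assumptions, not something I checked against the actual checkpoint, and `estimate_weights_gb` is a hypothetical helper, not part of the SageMaker SDK.

```python
def estimate_weights_gb(num_params: float, bytes_per_param: int = 4) -> float:
    """Rough size of the raw model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: ~6e9 parameters stored as 32-bit floats (4 bytes each).
size_gb = estimate_weights_gb(6e9)
print(f"Approximate weight size: {size_gb:.0f} GB")
```

If that estimate is in the right ballpark, the download is in the tens of gigabytes, which may be relevant to the restart loop I am seeing.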