@philschmid Thank you for checking in. I never figured out why the batch transform Hub Model configuration was causing problems, but I was able to get around the issue by downloading the model directly, compressing it, and then uploading it to S3.
Maybe the versions of PyTorch and Transformers that the batch transform Hub Model configuration expected were just incompatible with the versions my model had been trained with. I'm not sure, though, and am still a bit confused, since that wouldn't explain why I couldn't get it to work for the model used in your example.
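In case anyone hits a similar mismatch, a quick way to check the versions in the training environment so they can be compared against the inference container's versions (a minimal sketch):

import torch
import transformers

# print the versions the model was trained with, to compare against
# the transformers_version / pytorch_version of the inference container
print(f"transformers: {transformers.__version__}")
print(f"torch: {torch.__version__}")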
@kjackson the code for the workaround above is below in case it's helpful:
- get model data
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL = 'xxxxxxxxxx/xxxxxxxxxxxx'

# download the model and tokenizer from the Hub
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)
tokenizer = AutoTokenizer.from_pretrained(MODEL)

# save both into the same local folder so they can be packaged together
model.save_pretrained('model_token')
tokenizer.save_pretrained('model_token')
- compress the model folder and move the archive to the notebook directory
!cd model_token && tar zcvf model.tar.gz *
!mv model_token/model.tar.gz ./model.tar.gz
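Note the cd into the folder before running tar: SageMaker expects the model files at the root of model.tar.gz, so the archive has to contain the folder's contents rather than the folder itself.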
- upload the compressed model to the session S3 bucket
import sagemaker
from sagemaker.s3 import S3Uploader, s3_path_join
# get the s3 bucket
sess = sagemaker.Session()
role = sagemaker.get_execution_role()
sagemaker_session_bucket = sess.default_bucket()
# uploads a given file to S3.
upload_path = s3_path_join("s3://", sagemaker_session_bucket, "lab1_model")
print(f"Uploading Model to {upload_path}")
model_uri = S3Uploader.upload('model.tar.gz', upload_path)
print(f"Uploaded model to {model_uri}")
%store model_uri
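From there, the uploaded artifact can be used for batch transform by passing model_uri as model_data instead of relying on the Hub Model configuration. A minimal sketch, continuing from the cells above; the version strings, instance type, output prefix, and input location are assumptions to adjust for your own setup:

from sagemaker.huggingface import HuggingFaceModel

# point the model at the artifact in S3 instead of a Hub model id
huggingface_model = HuggingFaceModel(
    model_data=model_uri,
    role=role,
    transformers_version="4.26",  # assumption: match the versions used for training
    pytorch_version="1.13",       # assumption: match the versions used for training
    py_version="py39",
)

# create and run a batch transform job against the uploaded model
batch_job = huggingface_model.transformer(
    instance_count=1,
    instance_type="ml.m5.xlarge",  # placeholder instance type
    output_path=s3_path_join("s3://", sagemaker_session_bucket, "lab1_output"),
    strategy="SingleRecord",
)
batch_job.transform(
    data="s3://path/to/input.jsonl",  # placeholder input location
    content_type="application/json",
    split_type="Line",
)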