Endpoint Deployment

Hello everyone,

I deployed my BERT classification model for batch jobs on SageMaker with:

# create Hugging Face Model class
huggingface_model = HuggingFaceModel(
   model_data=model_uri,        # S3 URI of the trained model artifact (model.tar.gz)
   role=role,                   # IAM role with permissions to create an endpoint
   transformers_version="4.6",  # transformers version used
   pytorch_version="1.7",       # pytorch version used
   py_version="py36",           # python version used
)

# create Transformer to run our batch job
batch_job = huggingface_model.transformer(
    instance_count=1,               # assumed; not shown in my original snippet
    instance_type="ml.g4dn.xlarge", # assumed; pick an instance that fits the model
    output_path=output_s3_path,     # we save the output to the same S3 path as the input
)

# start the batch transform job, using the S3 data as input
batch_job.transform(
    data=input_s3_path,              # assumed variable: S3 URI of the .jsonl input file
    content_type="application/json",
    split_type="Line",
)

output_file = f"{dataset_jsonl_file}.out"
output_path = s3_path_join(output_s3_path,output_file)

# download the result file from S3
S3Downloader.download(output_path, ".")  # from sagemaker.s3 import S3Downloader

from ast import literal_eval

batch_transform_result = []
with open(output_file) as f:
    for line in f:
        # convert the jsonline array to a normal Python list
        line = "[" + line.replace("[", "").replace("]", ",") + "]"
        batch_transform_result += literal_eval(line)  # += so earlier lines aren't overwritten
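To see what the string surgery above does, here is a small self-contained example; the sample output line is made up for illustration:

```python
from ast import literal_eval

# one raw line as batch transform can write it: several JSON arrays
# concatenated into a single .out line (sample content is made up)
line = '[{"label": "POSITIVE", "score": 0.99}][{"label": "NEGATIVE", "score": 0.98}]'

# strip all "[", turn every "]" into ",", then wrap everything in one list:
# '[{...},{...},]' is valid for literal_eval despite the trailing comma
line = "[" + line.replace("[", "").replace("]", ",") + "]"
result = literal_eval(line)

print(result)  # -> [{'label': 'POSITIVE', 'score': 0.99}, {'label': 'NEGATIVE', 'score': 0.98}]
```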

Anyway, whenever I want to predict a new batch, it feels like I always have to start my notebook in SageMaker Studio first. Is there a way to create an API that I can feed with my data to predict from the outside? Is there a good tutorial or anything? Thanks in advance.

Hey @marlon89,

Sadly, I don't have a tutorial or a sample for it yet. I am looking into creating something like that in the next weeks, with CDK support.
I sketched an architecture of how this API could look.

You can basically leverage AWS Lambda with an S3 trigger to create your batch transform jobs after a file is uploaded to a prefix in an S3 bucket.
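A minimal sketch of such a Lambda handler, using boto3's `create_transform_job` API. The model name, output prefix, and instance type are placeholder assumptions; the request-building is split into its own function so it can be exercised without an AWS account:

```python
from datetime import datetime, timezone


def build_transform_request(bucket: str, key: str) -> dict:
    """Build a CreateTransformJob request for a freshly uploaded .jsonl file.
    Model name, output prefix, and instance type are assumed placeholder values."""
    job_name = "bert-batch-" + datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return {
        "TransformJobName": job_name,
        "ModelName": "bert-classification-model",  # assumed: your registered SageMaker model
        "TransformInput": {
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": f"s3://{bucket}/{key}",
                }
            },
            "ContentType": "application/json",
            "SplitType": "Line",
        },
        "TransformOutput": {"S3OutputPath": f"s3://{bucket}/output/"},  # assumed output prefix
        "TransformResources": {
            "InstanceType": "ml.g4dn.xlarge",  # assumed instance type
            "InstanceCount": 1,
        },
    }


def lambda_handler(event, context):
    # imported here so build_transform_request stays testable without boto3 installed
    import boto3

    # S3 put event -> start a batch transform job for the uploaded file
    record = event["Records"][0]["s3"]
    request = build_transform_request(record["bucket"]["name"], record["object"]["key"])
    boto3.client("sagemaker").create_transform_job(**request)
    return {"job": request["TransformJobName"]}
```

Configure the S3 event notification on the input prefix only (e.g. `input/`), not the whole bucket, otherwise the job's own output files would re-trigger the Lambda.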