A. Batch transform on 1M rows
2022-03-25 06:34:31,078 [WARN ] W-model-1-stderr com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Token indices sequence length is longer than the specified maximum sequence length for this model (528 > 512). Running this sequence through the model will result in indexing errors
2022-03-25 06:34:31,092 [WARN ] W-model-1-stderr com.amazonaws.ml.mms.wlm.WorkerLifeCycle - /codebuild/output/src257227288/src/aten/src/ATen/native/cuda/Indexing.cu:658: indexSelectLargeIndex: block: [249,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
2022-03-25 06:34:31,106 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Prediction error
2022-03-25 06:34:31,107 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Traceback (most recent call last):
2022-03-25 06:34:31,107 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 222, in handle
2022-03-25 06:34:31,107 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - response = self.transform_fn(self.model, input_data, content_type, accept)
2022-03-25 06:34:31,107 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 181, in transform_fn
B. Batch transform on smaller dataset (10K rows)
2022-03-25T07:17:24.961:[sagemaker logs]: sagemaker-us-east-2-460XXXXXXX64/batch_transform/input_head/oot_head_data.jsonl: ClientError: 400
2022-03-25T07:17:24.961:[sagemaker logs]: sagemaker-us-east-2-460XXXXXXX64/batch_transform/input_head/oot_head_data.jsonl:
2022-03-25T07:17:24.961:[sagemaker logs]: sagemaker-us-east-2-460XXXXXXX64/batch_transform/input_head/oot_head_data.jsonl: Message:
2022-03-25T07:17:24.961:[sagemaker logs]: sagemaker-us-east-2-460XXXXXXX64/batch_transform/input_head/oot_head_data.jsonl: {
2022-03-25T07:17:24.961:[sagemaker logs]: sagemaker-us-east-2-460XXXXXXX64/batch_transform/input_head/oot_head_data.jsonl: "code": 400,
2022-03-25T07:17:24.961:[sagemaker logs]: sagemaker-us-east-2-460XXXXXXX64/batch_transform/input_head/oot_head_data.jsonl: "type": "InternalServerException",
2022-03-25T07:17:24.961:[sagemaker logs]: sagemaker-us-east-2-460XXXXXXX64/batch_transform/input_head/oot_head_data.jsonl: "message": "CUDA error: device-side assert triggered"
2022-03-25 07:17:24,943 [WARN ] W-model-1-stderr com.amazonaws.ml.mms.wlm.WorkerLifeCycle - /codebuild/output/src257227288/src/aten/src/ATen/native/cuda/Indexing.cu:658: indexSelectLargeIndex: block: [251,0,0], thread: [35,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
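My understanding of the root cause: the warning in A says the tokenizer produced 528 token ids, but the model's position embeddings only cover 512, so the embedding lookup indexes past the end of the table and the CUDA-side assert (`srcIndex < srcSelectDimSize`) fires. Conceptually it is the same class of failure as this toy sketch (plain Python lists standing in for the real embedding table):

```python
# Toy illustration (not the real model): an embedding table sized for
# 512 positions cannot be indexed by token position 527.
embedding_table = [[0.0] * 8 for _ in range(512)]  # max_position_embeddings = 512
token_positions = list(range(528))                 # 528 tokens in the sequence

try:
    vectors = [embedding_table[p] for p in token_positions]
except IndexError as e:
    print("lookup failed:", e)
```

On GPU the out-of-range index surfaces as the device-side assert instead of a clean `IndexError`, which is why the whole job dies.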
The code that I am using to create the batch transform job is as follows:
import csv
import json

import sagemaker
from sagemaker.s3 import S3Uploader, s3_path_join
from sagemaker.huggingface import HuggingFaceModel

sess = sagemaker.Session()
sagemaker_session_bucket = sess.default_bucket()

df_processed_oot['inputs'] = df_processed_oot['detailedissue']
df_processed_oot[['inputs']].head(10000).to_csv(config['oot_head_csv'], index=None)

# dataset files
dataset_csv_file = config['oot_head_csv']
dataset_jsonl_file = "oot_head_data.jsonl"

# convert the CSV to JSON Lines, one {"inputs": ...} object per line
with open(dataset_csv_file, "r") as infile, open(dataset_jsonl_file, "w") as outfile:
    reader = csv.DictReader(infile)
    for row in reader:
        # remove @
        #row["inputs"] = row["inputs"].replace("@","")
        json.dump(row, outfile)
        outfile.write('\n')

input_s3_path = s3_path_join("s3://", sagemaker_session_bucket, "batch_transform/input_head")
output_s3_path = s3_path_join("s3://", sagemaker_session_bucket, "batch_transform/output_head")
s3_file_uri_head = S3Uploader.upload(dataset_jsonl_file, input_s3_path)
print(f"{dataset_jsonl_file} uploaded to {s3_file_uri_head}")
# create Hugging Face Model Class for the classifier
huggingface_model = HuggingFaceModel(
    model_data=model_uri,        # S3 URI of the trained model artifact
    role=role,                   # IAM role with permissions to create an endpoint
    transformers_version="4.6",  # transformers version used
    pytorch_version="1.7",       # pytorch version used
    py_version='py36',           # python version used
)

# create Transformer to run our batch job
batch_job = huggingface_model.transformer(
    instance_count=1,
    instance_type='ml.g4dn.xlarge',
    output_path=output_s3_path,  # separate output prefix in the same bucket as the input
    strategy='SingleRecord'
)

batch_job.transform(
    data=s3_file_uri_head,
    content_type='application/json',
    split_type='Line'
)
I am new to HF and the transformers library; it would be great if someone could help me with the easiest way to enable truncation during batch prediction on a large dataset.
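One idea I have seen mentioned (untested on my side, and whether the text-classification pipeline in transformers 4.6 honors it is my assumption): the inference toolkit forwards a top-level `"parameters"` object from each JSON line to the underlying pipeline, so the conversion step could request truncation per record instead of sending raw `{"inputs": ...}` lines:

```python
import csv
import io
import json

def csv_to_truncating_jsonl(csv_text):
    """Convert CSV rows with an 'inputs' column into JSON Lines records that
    ask the inference pipeline to truncate to the model's max length.
    The 'parameters' payload shape is my assumption about the toolkit."""
    out_lines = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        record = {"inputs": row["inputs"], "parameters": {"truncation": True}}
        out_lines.append(json.dumps(record))
    return "\n".join(out_lines) + "\n"

sample = "inputs\nfirst complaint text\nsecond complaint text\n"
print(csv_to_truncating_jsonl(sample))
```

Would this work with the batch transform setup above, or is a custom `inference.py` that tokenizes with `truncation=True, max_length=512` the recommended route?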