Error using 'MultiRecord' in batch transform

xiaocb · May 23, 2022, 11:40pm

Hi, I am running batch transform on SageMaker using a BERT model, and my input file is in JSON Lines format. If I select 'MultiRecord' for 'batchStrategy', it gives the following error:

"code": 400,
"type": "InternalServerException",
"message": "Extra data: line 2 column 1 (char 296)"

How could I modify the inference.py or input file format or the batch transform code to fix this error? Any suggestion is welcome.

Thanks in advance!

Xiao

philschmid · May 24, 2022, 6:54am

You could use local_mode to develop locally and see what you receive as inputs when working with MultiRecord.

Here is an example of how you can use local mode for regular endpoints: amazon-sagemaker-local-mode/pytorch_script_mode_local_model_inference.py at 84b08fd079cb810e0aa6059ecc75bae6f7f3f13d · aws-samples/amazon-sagemaker-local-mode · GitHub
This should work for batch transform as well: set instance_type='local' and use a local session.

AFAIK, with MultiRecord you will receive an array of "inputs" as data rather than a single dict.
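
For what it's worth, "Extra data: line 2 column 1" is the message Python's json.loads raises when a second JSON document follows the first one, which would fit several JSON Lines records arriving in a single request. Below is a minimal sketch that reproduces the message and parses the payload line by line instead. The helper name parse_json_lines and the sample payload are made up for illustration, and it is an assumption (not confirmed in this thread) that the container passes the records to inference.py as one newline-separated body.

```python
import json


def parse_json_lines(body):
    """Parse a JSON Lines payload (one JSON document per line) into a list.

    Hypothetical helper for illustration; not part of any SageMaker library.
    """
    if isinstance(body, (bytes, bytearray)):
        body = body.decode("utf-8")
    # Skip blank lines, e.g. the empty string after a trailing newline.
    return [json.loads(line) for line in body.split("\n") if line.strip()]


# Made-up payload: two records in one request body, as MultiRecord might send.
payload = '{"inputs": "first sentence"}\n{"inputs": "second sentence"}\n'

# Parsing the whole body as a single JSON document fails on the second record.
error_message = None
try:
    json.loads(payload)
except json.JSONDecodeError as err:
    error_message = str(err)

# Parsing line by line yields one dict per record.
records = parse_json_lines(payload)
texts = [record["inputs"] for record in records]
```

If inference.py overrides the input handling, a helper like this could be called there and the resulting list of texts passed to the model in one batch. Which hook to override, and what type the body arrives as, is worth confirming in local mode first.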

razido · May 29, 2022, 8:49am

xiaocb
Did you manage to solve this issue?
I am experiencing the same problem.
Thanks
