ClientError: 400 when using batch transformer for inference

Hi everyone,
I'm trying to do sentiment analysis on a bunch of data, following the example notebook notebooks/sagemaker-notebook.ipynb at main · huggingface/notebooks · GitHub. Below is my code:

from sagemaker.huggingface.model import HuggingFaceModel

# Hub model configuration for the Inference Toolkit
hub = {
    'HF_MODEL_ID': 'cardiffnlp/twitter-roberta-base-sentiment',
    'HF_TASK': 'text-classification'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,  # SageMaker execution role, e.g. from sagemaker.get_execution_role()
    transformers_version='4.6',
    pytorch_version='1.7',
    py_version='py36',
)

# create Transformer to run our batch job
batch_job = huggingface_model.transformer(
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    output_path=output_s3_path,  # S3 URI for the job output
    strategy='SingleRecord'
)

# start batch transform job, using the S3 data as input
batch_job.transform(
    data=input_s3_path,  # S3 URI of the input JSONL file
    content_type='application/json',
    split_type='Line'
)

My input jsonl file is formatted as one JSON object per line, for example:
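Illustrative placeholder rows (not my actual data):

{"inputs": "I absolutely love the new update!"}
{"inputs": "This is the worst experience I have had in years"}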

The model infers successfully when I feed it the first 10 rows (the full dataset is about 6k rows),
but it throws errors when I expand to 100 rows.

I’m super new to SageMaker and Hugging Face. Can anyone tell me what I’m missing? Thank you!

Sorry, it’s still me; just attaching more pics so the context can be better understood.

I heard BERT has a limitation on text length, so I truncated each line to 460 words.
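Roughly like this, in case it matters (a minimal sketch; text stands in for one line of my data):

text = "one line from my dataset"  # placeholder
truncated = " ".join(text.split()[:460])  # keep at most the first 460 whitespace-separated words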

Hi miOmiO - did the truncation fix your problem? If not, would you mind sharing what error message you are getting?

Hi @marshmellow77, yes, truncation makes some long texts acceptable to the model, but not all of them,
even when I set the length to 460 words. I tried going from 500 down to 460. I have no idea whether I should keep reducing this or whether some other modification would help.

@miOmiO you have created a second thread: ClientError:400 when using batch transformer on sagemaker for inference,
where I responded. Can this one be closed, or are these two different threads?

Hi @philschmid, sure, please kindly close this post; I will update on that one. Thank you.

Thanks for letting me know. Here is the response again:

Hey @miOmiO,

Happy to help you here! To narrow down your issue, I think the first step would be to check whether an input file created as in the sample (notebooks/sagemaker-notebook.ipynb at master · huggingface/notebooks · GitHub) works, or whether it also errors out.
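For reference, a minimal sketch of producing such a JSONL input file (the sentences and filename are placeholders):

import json

texts = ["I love this!", "This is terrible."]  # placeholder sentences

# write one {"inputs": ...} JSON object per line, as the batch transform expects
with open("tweet_data.jsonl", "w") as f:
    for text in texts:
        f.write(json.dumps({"inputs": text}) + "\n")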

Additionally, could you bump the versions of the HuggingFaceModel to the latest ones? For transformers_version that's 4.12.3 and for pytorch_version it's 1.9.1; maybe this already solves your issue. You can find the list of available containers here: Reference
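A minimal sketch of that bump (assuming py38 is the Python tag paired with these framework versions):

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version='4.12.3',
    pytorch_version='1.9.1',
    py_version='py38',  # assumption: the Python tag shipped with these versions
)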

Also worth testing: replace your model with a different one, e.g. distilbert-base-uncased-finetuned-sst-2-english

Hi @philschmid ,
Thank you for the reply! I checked all three approaches:
1. The input format complies with the JSON format from the sample notebook.
2. I used the latest framework versions as suggested.
3. I also switched to a different model to see if it was a model-specific issue.

It still throws ClientError: 400, but I notice that as long as I truncate each line of text to 60 words (a number I picked at random), both models work fine. Is this related to the text length limitation? If so, is there any way to specify the length of the text when building the batch job?

P.S. My dataset is around 6k rows; some rows have more than 512 words/tokens.

Hi @miOmiO - the text length limitation could indeed be the issue here. Note that the length refers to the number of tokens, not the number of words. Because BERT models generally use subword tokenization, one word can be split into 2 or more tokens. That is why even reducing the number of words to 460 sometimes still throws an error.
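You can see this locally with a quick check (a minimal sketch; the sample sentence is made up):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("cardiffnlp/twitter-roberta-base-sentiment")
text = "Unbelievably, the overparameterized tokenizer subdivides uncommon words"
tokens = tokenizer.tokenize(text)
print(len(text.split()), "words ->", len(tokens), "tokens")  # tokens usually outnumber words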

To test this you could run the model row by row and check whether the examples that fail are the same ones that fail in your batch job. If it is indeed the number of tokens that causes the model to fail, you should see an error message like "... sequence length is longer than the specified maximum sequence length for this model ..."
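Something along these lines, run locally (a sketch; input.jsonl stands in for your file):

import json
from transformers import pipeline

classifier = pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-sentiment")

with open("input.jsonl") as f:
    for i, line in enumerate(f):
        try:
            classifier(json.loads(line)["inputs"])
        except Exception as e:  # record which rows the model chokes on
            print(f"row {i} failed: {e}")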

If this is indeed the source of the error, it might be easiest to truncate the input sequence of tokens after tokenization (rather than truncating the number of words before tokenization).
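For example, reusing the tokenizer and text from the snippet above (512 is the usual RoBERTa limit):

# encode with token-level truncation, then decode back to a string if needed
encoded = tokenizer(text, truncation=True, max_length=512)
truncated_text = tokenizer.decode(encoded["input_ids"], skip_special_tokens=True)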

Hope that helps.

The model cardiffnlp/twitter-roberta-base-sentiment doesn't have a max length defined. I tried to reach out to the authors, but they haven't responded. See Add `tokenizer_max_length` to `cardiffnlp/twitter-roberta-base-sentiment` · Issue #13459 · huggingface/transformers · GitHub

You could “fork” the model → create a new model repository, push the weights, and add the tokenizer_config; then truncation=True should work properly.
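A rough sketch of that fork (the target repo name is hypothetical, and you need to be logged in, e.g. via huggingface-cli login):

from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo = "your-username/twitter-roberta-base-sentiment-maxlen"  # hypothetical repo name

model = AutoModelForSequenceClassification.from_pretrained("cardiffnlp/twitter-roberta-base-sentiment")
tokenizer = AutoTokenizer.from_pretrained("cardiffnlp/twitter-roberta-base-sentiment")
tokenizer.model_max_length = 512  # add the missing max length to the tokenizer config

model.push_to_hub(repo)
tokenizer.push_to_hub(repo)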

Hi @marshmellow77, thank you for leading me to think about subword tokenization.

Hi @philschmid, I got the solution from another related post, with your help as well:
How are the inputs tokenized when model deployment? - Amazon SageMaker - Hugging Face Forums

After I switched to another model and changed the input JSON file format to this:
{"inputs": "long sentence 2", "parameters": {"truncation": true}}
the new model works well for me (as long as it has a 'max_length' attribute in its tokenizer config file).
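As I understand it from that post, the "parameters" object is forwarded to the pipeline as keyword arguments, so the equivalent local call would look roughly like this (the model name is just an example):

from transformers import pipeline

classifier = pipeline("text-classification", model="distilbert-base-uncased-finetuned-sst-2-english")
print(classifier("long sentence 2", truncation=True))  # "parameters" become kwargs like this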
