Inference Hyperparameters

Interesting. What’s your SageMaker version? Mine is 2.48.

I upgraded my SageMaker SDK from 2.48 to 2.59. I used the same code to redeploy my model and invoke the endpoint. This time I got a new error message:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
"code": 400,
"type": "InternalServerException",
"message": "invalid type: boolean true, expected struct TruncationParams at line 1 column 34"
}

Here is the server-side error:

2021-09-24 17:22:01,699 [INFO ] W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)

Exception: invalid type: boolean true, expected struct TruncationParams at line 1 column 34
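The trace points at the Rust tokenizer failing to deserialize tokenizer.json: a bare `"truncation": true` where newer `tokenizers` releases expect a TruncationParams object (or null). If that is indeed what your tokenizer.json contains, one possible workaround is to reset the field before repacking model.tar.gz. This is only a sketch; `reset_truncation` is an illustrative name, and whether your file has this shape is an assumption worth checking first:

```python
import json

def reset_truncation(tokenizer_json_path):
    # Hypothetical repair: if the saved tokenizer serialized truncation as a
    # bare boolean, newer `tokenizers` releases cannot deserialize it into a
    # TruncationParams struct. Resetting the field to null lets the file load
    # again; the pipeline can still request truncation at call time.
    with open(tokenizer_json_path) as f:
        config = json.load(f)
    if isinstance(config.get("truncation"), bool):
        config["truncation"] = None
        with open(tokenizer_json_path, "w") as f:
            json.dump(config, f, ensure_ascii=False)
```

You would run this against the tokenizer.json inside the unpacked model.tar.gz, then repack and redeploy.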

It says you have an issue with your decoding. Can you please copy my snippet and test it with your endpoint_name?

@philschmid I copied your code and got the same error. I think the implementation of the truncation parameter changed in the latest SageMaker version. What’s your SageMaker version?

Here is the error with your code and my endpoint name:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
"code": 400,
"type": "InternalServerException",
"message": "invalid type: boolean true, expected struct TruncationParams at line 1 column 34"
}

Have you made any other changes to the model or inference script? Can you please share exactly what you execute and do? The issue must be on your side somewhere, since it works perfectly for the ALBERT model I tested.

I downloaded the same albert-base-v2-imdb model from Hugging Face. With it, the truncation parameter worked.

However, with the same deployment and prediction code, the truncation parameter didn’t work with my model. No, I don’t have any custom pipeline code in the model.tar.gz file. I retrained an albert_xx_large, then fine-tuned this model.

Still getting an error:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
"code": 400,
"type": "InternalServerException",
"message": "The size of tensor a (577) must match the size of tensor b (512) at non-singleton dimension 1"
}
". See https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/sagemaker/Endpoints/huggingface-pytorch-inference-2021-10-06-00-23-07-036 in account 209338229909 for more information.
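For reference, ALBERT has 512 position embeddings and this input tokenizes to 577 tokens, hence the size mismatch. `truncation: True` truncates to the tokenizer’s `model_max_length`, so if that value was lost when the tokenizer was saved (it can default to a huge sentinel), nothing actually gets cut. One thing worth trying, assuming the inference toolkit forwards extra parameters to the pipeline (which depends on the version), is an explicit `max_length`:

```python
# Hedged sketch, not a confirmed fix: cap the sequence at ALBERT's 512
# positions explicitly, in case the saved tokenizer lost its model_max_length.
input_sentence = "2 EXCEPT IN THE CASE OF WILFUL MISCONDUCT ..."  # stands in for the long clause below

payload = {
    "inputs": input_sentence,
    "parameters": {"truncation": True, "max_length": 512},
}

# predictor.predict(payload)  # same predictor object as in the earlier snippets
```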

Okay, then we know it is not a SageMaker specific issue.

What happens if you load your model and tokenizer with from_pretrained in a notebook and try to use them with the pipelines and truncation=True?
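That local check might look something like this (a sketch only; `check_truncation` and `model_dir` are illustrative, with `model_dir` standing in for the directory you packed into model.tar.gz):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

def check_truncation(model_dir, text):
    # model_dir is a placeholder for wherever you unpacked model.tar.gz
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)

    # For ALBERT this should be 512; a huge sentinel value here would mean
    # truncation=True does not actually cut the input down to 512 tokens.
    print("model_max_length:", tokenizer.model_max_length)

    classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
    return classifier(text, truncation=True)
```

If this reproduces the tensor-size error outside SageMaker, the problem is in the saved tokenizer rather than in the endpoint.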

And just to be sure can you share the request you sent with the predictor to your model here?

Testing with pipeline.

Here is my prediction code:

input_sentence = "2 EXCEPT IN THE CASE OF WILFUL MISCONDUCT OR NEGLIGENCE, NEITHER BANK(INCLUDING ITS DIRECTORS, AGENTS, EMPLOYEES OR SUB-CONTRACTORS) NOR ANY OF ITSSERVICE PROVIDERS SHALL BE LIABLE FOR ANY LOSS, DAMAGE OR CLAIM OF ANY KINDWHATSOEVER ARISING DIRECTLY OR INDIRECTLY AS A RESULT OF (1) CONTENT ON THESYSTEM; (2) ANY ERRORS IN OR OMISSIONS FROM THE SYSTEM; (3) USE OF OR ACCESS TOTHE SYSTEM; (4) CLIENT\u0092S INABILITY TO ACCESS OR USE THE SYSTEM FOR ANY REASON; (5)ANY FAILURE BY THE SYSTEM TO TRANSMIT, OR ANY DELAY IN THE TRANSMISSION OR THERECEIPT BY BANK OF ANY INSTRUCTIONS, ANY REJECTION OR NON-EXECUTION OF ANY.- 5 -INSTRUCTIONS OR ANY FAILURE OF THE SYSTEM TO TRANSMIT, OR ANY DELAY IN THETRANSMISSION OR THE RECEIPT BY THE CLIENT OF, ANY NOTIFICATION THAT ANYINSTRUCTIONS HAVE OR HAVE NOT BEEN EXECUTED; OR (6) ANY UNAUTHORISED ACCESS TOTHE SYSTEM OR ANY OTHER MEANS OF COMMUNICATION UTILISED BY BANK IN RELATION TOTHE SERVICES PROVIDED PURSUANT TO THE TERMS AND CONDITIONS.7.3 TO THE FULL EXTENT PERMITTED BY LAW, NEITHER BANK NOR ANY OF ITS SERVICEPROVIDERS SHALL BE LIABLE FOR ANY (1) LOSS OF PROFITS OR REVENUE OR SAVINGS OROTHER ECONOMIC LOSS, (2) LOSS OF BUSINESS OR GOODWILL, (3) LOSS OF OR DAMAGE TODATA, (4) INCIDENTAL OR SPECIAL LOSS, (5) WASTED OR LOST MANAGEMENT TIME, OR (6)INDIRECT OR CONSEQUENTIAL LOSS ARISING FROM CLIENTS USE OF OR ACCESS TO THESYSTEM EVEN IF ADVISED OF THE POSSIBILITY OF ANY SUCH LOSS OR DAMAGE OR IF SUCHLOSS OR DAMAGE WAS FORESEEABLE.7.4 TO THE FULL EXTENT PERMITTED BY LAW, BANK\u0092S AND ANY SERVICE PROVIDER\u0092S TOTALLIABILITY ARISING OUT OF OR IN CONNECTION WITH THE SYSTEM OR OTHERWISE UNDERTHESE TERMS AND CONDITIONS SHALL BE LIMITED TO THE SUM OF \u00e010000 (TEN THOUSANDPOUNDS STERLING) OR THE EQUIVALENT IN OTHER CURRENCY.7.5 NOTHING IN THESE TERMS AND CONDITIONS EXCLUDES OR LIMITS BANK\u0092S OR A SERVICEPROVIDER\u0092S LIABILITY FOR FRAUD OR FOR PERSONAL INJURY OR DEATH CAUSED BY BANK\u0092SNEGLIGENCE.7.6 NEITHER 
BANK NOR ANY OF ITS SERVICE PROVIDERS WILL BE LIABLE FOR ANY FAILURETO PERFORM ANY OBLIGATION UNDER THESE TERMS AND CONDITIONS OR FROM ANY DELAYIN THE PERFORMANCE THEREOF, DUE TO CAUSES BEYOND ITS REASONABLE CONTROL,INCLUDING INDUSTRIAL DISPUTES OF ANY NATURE, ACTS OF GOD, ACTS OF A PUBLIC ENEMY,ACTS OF GOVERNMENT, FAILURE OF TELECOMMUNICATIONS, EXCHANGE OR MARKETRULINGS OR SUSPENSION OF TRADING, SABOTAGE, PESTILENCE, TERRORISM, LIGHTNING ORELECTRO-MAGNETIC DISTURBANCES, EARTHQUAKE, FLOOD, FIRE OR OTHER CASUALTY.8."

input = {"inputs": input_sentence,
         "parameters": {"truncation": True}}

predictor.predict(input)

@philschmid can you share the pipeline code for model prediction? I haven’t found one yet. Thanks.

You can find the documentation and instructions here: Pipelines — transformers 4.11.3 documentation