How are the inputs tokenized during model deployment?

Training and inference are two completely different things. You are using the same tokenizer, but not with the same configuration.

First of all, it is not possible to predict on a sequence longer than 512 tokens with the model you are using. This means you can either switch to a model that supports longer input sequences, e.g. Longformer, or truncate your inputs in advance so that you only send inputs shorter than 512 tokens.
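If you want to truncate on the client side, a minimal sketch could look like the one below. It assumes your endpoint was deployed from a checkpoint such as distilbert-base-uncased-finetuned-sst-2-english (a placeholder; use the checkpoint your endpoint actually serves) and that predictor is the SageMaker predictor from your deployment, as in the snippet further down.

from transformers import AutoTokenizer

# Load the same tokenizer the deployed model was trained with.
# The checkpoint name here is only an example; replace it with your own.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

long_sentence = "...."  # longer than 512 tokens

# Tokenize with truncation so at most 512 tokens remain, then decode back to text
# so the payload sent to the endpoint already fits the model's limit.
input_ids = tokenizer(long_sentence, truncation=True, max_length=512)["input_ids"]
truncated_sentence = tokenizer.decode(input_ids, skip_special_tokens=True)

predictor.predict({"inputs": truncated_sentence})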

Additionally, you could use the parameters key of your request to automatically truncate any incoming sequence, meaning the inference pipeline would cut the input after 512 tokens:

long_sentence = "...."  # longer than 512 tokens

sentiment_input = {
    "inputs": long_sentence,
    "parameters": {"truncation": True}
}

predictor.predict(sentiment_input)