Limit max # of tokens for inference in pipeline?

I’m following the first example for fine-tuning a model. In particular, I am tokenizing like so:

# source is a dataset with text and label

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)
train =
trainer = Trainer(

Now I want to do some inference, so I make a pipeline:


Then I pass a list of strings through it:


and get this error:

RuntimeError: The size of tensor a (560) must match the size of tensor b (512) at non-singleton dimension 1

This happens because one of the strings in the list is too long and produces too many tokens (560, against a limit of 512). When I ran inference by calling the tokenizer myself, I could pass max_length as a parameter, but I can’t see how to configure the pipeline so that this is done inside it. I’ve tried some other approaches to doing inference without the pipeline, but the documentation keeps sending me back to it. What should I do?
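For reference, here is a minimal sketch of what I am hoping to be able to do. It assumes a text-classification pipeline (the task is my assumption, since the dataset has text and label), and it relies on that pipeline forwarding extra keyword arguments from the call to the tokenizer. `classify`, `TOKENIZER_KWARGS`, and `model_dir` are hypothetical names used only for illustration; `model_dir` stands in for wherever the fine-tuned model was saved.

```python
# Tokenizer arguments to forward through the pipeline call.
# 512 matches the limit reported in the error message above.
TOKENIZER_KWARGS = {"truncation": True, "max_length": 512}


def classify(texts, model_dir):
    """Classify a list of strings, truncating any input that is too long.

    model_dir is a hypothetical path to the saved fine-tuned model.
    """
    # Imported inside the function so that defining this helper
    # does not itself require transformers to be installed.
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
    # Assumption: the fine-tuned model is a sequence classifier.
    classifier = pipeline("text-classification", model=model_dir, tokenizer=tokenizer)
    # The extra keyword arguments are passed on to the tokenizer,
    # so each input is cut to at most max_length tokens before the forward pass.
    return classifier(texts, **TOKENIZER_KWARGS)
```

Truncating this way discards everything after the first 512 tokens of a long string, which may or may not be acceptable depending on where the useful signal sits in the text.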