Where to set the batch size for text generation?

I trained a model and now I’m trying to generate text in batches.

I have the following script, and I always run out of memory.

Where do I set the batch size, i.e. how many texts are decoded at a time?

Or is there another way to decode a long list of input texts?

# Tokenize every row of the dataframe at once
batch = tokenizer(
    df['original_txt'].tolist(),
    truncation=True,
    padding='longest',
    max_length=80,
    return_tensors="pt",
).to(device)

# Generate for all inputs in a single call
generated = model.generate(
    **batch,
    max_length=80,
    no_repeat_ngram_size=3,
)

# Decode the full list of generated outputs
derived_summaries = tokenizer.batch_decode(generated, skip_special_tokens=True)
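
Would slicing the list into fixed-size chunks and calling generate on each chunk be the right way to do this? Something like the rough sketch below (it reuses the same tokenizer, model, df, and device as above; the chunk size of 16 is just a placeholder I'd tune until it fits in memory):

import torch

chunk_size = 16  # placeholder; I'd lower this if memory is still an issue
texts = df['original_txt'].tolist()
derived_summaries = []

for i in range(0, len(texts), chunk_size):
    chunk = texts[i:i + chunk_size]
    batch = tokenizer(
        chunk,
        truncation=True,
        padding='longest',
        max_length=80,
        return_tensors="pt",
    ).to(device)
    with torch.no_grad():  # generation doesn't need gradients, so this saves memory
        generated = model.generate(
            **batch,
            max_length=80,
            no_repeat_ngram_size=3,
        )
    derived_summaries.extend(
        tokenizer.batch_decode(generated, skip_special_tokens=True)
    )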