BART model.generate() RuntimeError: Boolean value of Tensor with more than one value is ambiguous

I suspect I’m making some basic mistake here. I’m trying to fine-tune BART for a sequence-to-sequence generation task. I first ran into all the issues discussed here, including the final unresolved one: the process gets killed during evaluation because it runs out of RAM when Trainer is used for evaluation. So I tried writing a custom script for inference and evaluation.
I’m working with Python 3.8.11 and transformers 4.17.0 and here is what I’m starting with:

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

if __name__ == '__main__':
    model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
    tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base", use_fast=True)
    input_texts = ["Input text 1", "Longer input text 2"]
    input_encodings = tokenizer(input_texts, padding=True, truncation=True)
    input_ids = torch.tensor(input_encodings["input_ids"])
    attention_mask = torch.tensor(input_encodings["attention_mask"])
    output_tokens = model.generate(input_ids, attention_mask)
    output_texts = tokenizer.decode(output_tokens)

I get the following error:

Traceback (most recent call last):
  File "", line 11, in <module>
    output_tokens = model.generate(input_ids, attention_mask)
  File ".../lib/python3.8/site-packages/torch/autograd/", line 26, in decorate_context
    return func(*args, **kwargs)
  File ".../lib/python3.8/site-packages/transformers/", line 1133, in generate
    if input_ids.shape[-1] >= max_length:
RuntimeError: Boolean value of Tensor with more than one value is ambiguous

Any suggestions on resolving this would be appreciated.

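A friend found the answer. The issue is that attention_mask is not the second positional parameter of generate. Passed positionally, the mask tensor appears to be bound to max_length instead, which is why the traceback fails at the comparison input_ids.shape[-1] >= max_length: comparing against a multi-element tensor yields a tensor whose truth value is ambiguous. Passing both arguments by keyword avoids this. In addition, generate returns a batch of sequences, so the output should be decoded with batch_decode rather than decode.
So the last two lines should be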

    output_tokens = model.generate(input_ids=input_ids, attention_mask=attention_mask)
    output_texts = tokenizer.batch_decode(output_tokens)
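
To make the positional-argument pitfall concrete, here is a minimal sketch that does not use transformers at all. toy_generate is a hypothetical stand-in, and its parameter order (max_length in second position) is an assumption inferred from the traceback above, not a copy of the library's actual signature. It only shows where each argument ends up.

```python
# Hypothetical stand-in for generate(); NOT the real transformers API.
# Assumption: max_length is the second positional parameter, as the
# traceback in the question suggests.
def toy_generate(input_ids=None, max_length=None, attention_mask=None):
    # Return the bound arguments so we can inspect where each one landed.
    return {
        "input_ids": input_ids,
        "max_length": max_length,
        "attention_mask": attention_mask,
    }

# Plain lists stand in for the tensors used in the question.
input_ids = [[0, 1, 2], [0, 1, 1]]
attention_mask = [[1, 1, 1], [1, 1, 0]]

# Positional call: the mask silently lands in max_length.
positional = toy_generate(input_ids, attention_mask)

# Keyword call: every argument lands where it was intended.
keyword = toy_generate(input_ids=input_ids, attention_mask=attention_mask)

print("positional max_length is the mask:", positional["max_length"] is attention_mask)
print("keyword max_length is unset:", keyword["max_length"] is None)
```

With the real model, the misplaced mask would then be compared against a sequence length, which is where the ambiguous-boolean error comes from. Passing arguments by keyword sidesteps any dependence on parameter order.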