Why is BART so much slower than T5?

When I run the following code, here are my timed results (on CPU):

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "t5-base"  # also timed with "facebook/bart-large" and "t5-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = """Many answers here mention tools that require installation, but nobody has mentioned that two of Ubuntu's scripting languages, Perl and Python, already come with all the necessary modules that allow you to unzip a zip archive, which means you don't need to install anything else."""

input_ids = tokenizer(text, return_tensors="pt")["input_ids"]
output_ids = model.generate(input_ids, min_length=20, max_length=20)
print(tokenizer.decode(output_ids[0]))
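If anyone wants to reproduce the timings, a minimal harness along these lines works (a sketch: n_runs = 10 is an illustrative choice, and it reuses the model and input_ids from the snippet above):

import time

n_runs = 10  # illustrative; more runs give a steadier average
start = time.perf_counter()
for _ in range(n_runs):
    model.generate(input_ids, min_length=20, max_length=20)
avg = (time.perf_counter() - start) / n_runs
print(f"{avg:.2f} seconds average per generate() call")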

t5-base (220M parameters): 0.78 seconds average
bart-large (400M parameters): 2.35 seconds average
t5-large (770M parameters): 2.78 seconds average

Why is bart-large so much slower, relative to its parameter count, than both versions of T5? It has roughly half the parameters of t5-large, yet takes almost as long to generate.