Inference speed between pipelines and model heads

Is there any latency difference between the generic pipeline class and a more task-specific implementation that calls the tokenizer and model directly? For example, is there a speed difference between the two code blocks below? And does batching change things?

from transformers import pipeline

generator = pipeline(model="gpt2")
generator("I can't believe you did such a ", do_sample=False)

# These parameters return several suggestions and only the newly generated text (without the prompt), which is useful for prompt suggestions.
outputs = generator("My tart needs some", num_return_sequences=4, return_full_text=False)
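
On the batching part of the question, this is roughly what I mean: as I understand it, the pipeline can batch internally if you pass it a list of prompts and a batch_size. A minimal sketch (the prompt list and batch size are just illustrative):

from transformers import pipeline

generator = pipeline(model="gpt2")

# Hypothetical list of prompts, just to illustrate passing a list plus batch_size
prompts = ["I can't believe you did such a ", "My tart needs some"] * 8

# batch_size > 1 should let the pipeline group several prompts per forward pass
outputs = generator(prompts, batch_size=8, do_sample=False)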

The other method would be:

from transformers import GPT2Tokenizer, GPT2LMHeadModel
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
# GPT2LMHeadModel includes the language-modelling head; the bare GPT2Model only returns hidden states
model = GPT2LMHeadModel.from_pretrained('gpt2')
text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
# Greedy decoding, matching do_sample=False in the pipeline call above
output_ids = model.generate(**encoded_input, do_sample=False)
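
To measure the difference concretely, I was thinking of something like the rough timing sketch below; time.perf_counter, the warm-up call, the repeat count, and max_new_tokens=20 are my own illustrative choices, not a rigorous benchmark:

import time
from transformers import pipeline, GPT2Tokenizer, GPT2LMHeadModel

prompt = "I can't believe you did such a "

generator = pipeline(model="gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

def timed(fn, repeats=5):
    # One warm-up call, then average wall-clock time over a few repeats
    fn()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats

# max_new_tokens=20 just pins both paths to the same amount of generation work
pipeline_time = timed(lambda: generator(prompt, do_sample=False, max_new_tokens=20))

inputs = tokenizer(prompt, return_tensors="pt")
direct_time = timed(lambda: model.generate(**inputs, do_sample=False, max_new_tokens=20))

print(f"pipeline: {pipeline_time:.3f}s  direct generate: {direct_time:.3f}s")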

Moreover, are there any benchmarks on the fastest way to run HF models in general? Should they be exported to another format for optimization?
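
For context, by "exported to another format" I mean something like the ONNX Runtime route via the separate optimum package. A rough sketch of what I have in mind (assuming optimum with the onnxruntime extra is installed; the export flag has varied across Optimum versions, so treat this as indicative rather than exact):

from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer, pipeline

# Export gpt2 to ONNX and run it with ONNX Runtime instead of PyTorch
ort_model = ORTModelForCausalLM.from_pretrained("gpt2", export=True)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

onnx_generator = pipeline("text-generation", model=ort_model, tokenizer=tokenizer)
print(onnx_generator("I can't believe you did such a ", do_sample=False))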