Pipeline vs model.generate()

Hi,

The pipeline() API is meant mostly for people who don’t need to control the details of the underlying process and just want to use a machine learning model without implementing steps like pre- and postprocessing themselves. It gives you an easy-to-use abstraction over any ML model, which is great for inference. The SummarizationPipeline, for instance, uses generate() behind the scenes.

On the other hand, if you do care about the details, it’s recommended to call generate() yourself and implement the pre- and postprocessing on your own.
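To make the manual route concrete, here is a minimal sketch of the three steps a pipeline would otherwise hide. The helper names summarize() and demo(), the "t5-small" checkpoint, and the chosen generation arguments are my own assumptions for illustration, not something the pipeline prescribes.

```python
def summarize(text, tokenizer, model, **generate_kwargs):
    """Pre-process, generate, post-process: the steps a pipeline does for you."""
    # Pre-processing: turn the raw string into model inputs (token IDs, attention mask).
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    # Generation: any keyword argument accepted by generate() can be passed here.
    output_ids = model.generate(**inputs, **generate_kwargs)
    # Post-processing: turn the generated token IDs back into a string.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def demo():
    # Imported lazily so that defining summarize() does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    name = "t5-small"  # assumed checkpoint, chosen only for illustration
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSeq2SeqLM.from_pretrained(name)

    # T5 expects a task prefix; when calling generate() yourself you add it by hand.
    text = "summarize: " + (
        "The pipeline API wraps tokenization, generation and decoding in one call, "
        "while calling generate() directly leaves each of those steps to you."
    )
    print(summarize(text, tokenizer, model, max_new_tokens=30, num_beams=4))
```

Calling demo() downloads the checkpoint on first use. The point of splitting it this way is that you control each step: you can change the truncation, inspect output_ids before decoding, or swap in your own decoding logic.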

Also note that any text generation pipeline provides a generate_kwargs argument, so technically you can forward any of the keyword arguments that generate() supports to the pipeline as well.
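As a rough sketch of that forwarding, the snippet below passes generation arguments as plain keyword arguments when calling a text-generation pipeline. The helper names, the "gpt2" checkpoint, and the specific arguments are assumptions for illustration; check the documentation of your installed transformers version for the exact form your pipeline accepts.

```python
def generate_with_pipeline(prompt, generator, **generate_kwargs):
    """Call a text-generation pipeline, forwarding extra arguments to generate()."""
    # The pipeline handles tokenization and decoding; the extra keyword
    # arguments are handed on to the model's generate() call.
    outputs = generator(prompt, **generate_kwargs)
    # For a single prompt, the text-generation pipeline returns a list of dicts.
    return outputs[0]["generated_text"]


def demo():
    # Imported lazily so that the helper above can be defined without transformers.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")  # assumed checkpoint
    print(
        generate_with_pipeline(
            "The pipeline API is",
            generator,
            max_new_tokens=20,  # generate() argument: cap on newly generated tokens
            do_sample=True,     # generate() argument: sample instead of greedy decoding
            top_k=50,           # generate() argument: restrict sampling to the top 50 tokens
        )
    )
```

So you keep the convenience of the pipeline while still adjusting how the text is decoded.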
