Hi @kmfoda ,
Do you mind sharing why you want to achieve this? It might help us understand the underlying issue. In general, getting 1-1 reproducibility on non-deterministic generation is hard (especially on GPU, where floating-point errors can creep in too). Internally we check that 1-step generation is 1-1, but over long generations drift can happen, so we have to focus on the core question: is the summarization good? (i.e. does it contain the key elements we expect?)
Pipeline (and therefore the API) makes no attempt to control the underlying seed, and the RNG can be in an arbitrary state (since the API is a long-running job and might run other code at any given time).
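For what it's worth, if you run the pipeline locally (the hosted API gives you no such hook), you can pin the RNG state yourself right before each call. A minimal sketch — the `seeded` helper is just illustrative, and `torch.randn` stands in for a sampling-based `generate()` call:

```python
import torch

def seeded(fn, seed=42):
    # Pin torch's RNG right before the call; for full coverage you'd also
    # want to seed python's `random` and numpy (transformers ships a
    # `set_seed` helper that does all three).
    torch.manual_seed(seed)
    return fn()

# Stand-in for a sampling-based generate() call:
a = seeded(lambda: torch.randn(3))
b = seeded(lambda: torch.randn(3))
assert torch.equal(a, b)  # identical draws once the seed is pinned
```

Note this still doesn't guarantee 1-1 outputs across hardware, since some GPU kernels are themselves non-deterministic.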
The only other sources of variance I can think of are already accounted for: the pipeline calls `model.eval()` to disable any kind of dropout/batch-norm randomness, and it runs under `torch.inference_mode()`, which at least deactivates gradient calculation (I have no clue whether this impacts the RNG state).
Does this help ?
Cheers,
Nicolas