Hi @kmfoda ,
Do you mind sharing why you want to achieve this? It might help us understand the underlying issue. In general, getting 1-1 reproducibility on non-deterministic generation is hard (especially on GPU, where floating-point errors can creep in too). Internally we check that 1-step generation is 1-1, but over long generations drift can happen, so we have to focus on the core question: is the summarization good? (i.e. does it contain the key elements we expect?)
Pipeline (and therefore the API) makes no attempt to control the underlying seed, and the RNG can be in an arbitrary state (since the API is a long-running job and might run other code at any given time).
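For what it's worth, if you run the pipeline locally (the hosted API gives you no such hook), you can pin the RNG state yourself right before each call. A minimal sketch — the `seeded` helper is just illustrative, and `torch.randn` stands in for a sampling-based `generate()` call:

```python
import torch

def seeded(fn, seed=42):
    # Pin torch's RNG right before the call; for full coverage you'd also
    # want to seed python's `random` and numpy (transformers ships a
    # `set_seed` helper that does all three).
    torch.manual_seed(seed)
    return fn()

# Stand-in for a sampling-based generate() call:
a = seeded(lambda: torch.randn(3))
b = seeded(lambda: torch.randn(3))
assert torch.equal(a, b)  # identical draws once the seed is pinned
```

Note this still doesn't guarantee 1-1 outputs across hardware, since some GPU kernels are themselves non-deterministic.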
The only other sources of variance I can think of are already accounted for: the pipeline calls `model.eval()` to disable any kind of dropout/batch-norm randomness, and it runs under `torch.inference_mode()`, which at least deactivates gradient calculation (I have no clue whether this impacts the RNG state).
Does this help ?
Cheers,
Nicolas