Getting ValueError when exporting model to ONNX using optimum

Hi @mineshj1291, the reason I asked is that the current ORTModelForCausalLM doesn’t have with_past support, so it recomputes the attention over the past sequence at every generation step. If your PyTorch model makes use of that precomputation (the KV cache), it isn’t a fair comparison against the current ORTModelForCausalLM, which is doing extra work.
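If you want a rough apples-to-apples latency check in the meantime, one option is to disable the KV cache on the PyTorch side so both models redo the attention over the prefix at each step. Here is a minimal sketch; the checkpoint name, token counts, and the `export=True` kwarg (which may be `from_transformers=True` on older optimum versions) are just illustrative:

```python
import time

from transformers import AutoModelForCausalLM, AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

model_id = "gpt2"  # placeholder checkpoint, swap in your own model
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("Hello, my name is", return_tensors="pt")

# PyTorch baseline with the KV cache turned off, to mirror what the
# current ORTModelForCausalLM does (recomputing attention over the prefix)
pt_model = AutoModelForCausalLM.from_pretrained(model_id)
start = time.perf_counter()
pt_model.generate(**inputs, max_new_tokens=32, use_cache=False)
print(f"PyTorch (no cache): {time.perf_counter() - start:.2f}s")

# ONNX Runtime model exported through optimum
ort_model = ORTModelForCausalLM.from_pretrained(model_id, export=True)
start = time.perf_counter()
ort_model.generate(**inputs, max_new_tokens=32)
print(f"ONNX Runtime:       {time.perf_counter() - start:.2f}s")
```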

Btw, if you are interested, you can follow the PR that adds with_past support to ORTModelForCausalLM. I will try to finish it this week.