| Topic | Replies | Views | Activity |
|---|---|---|---|
| How to use optimum with encoder-decoder models | 1 | 590 | October 16, 2022 |
| Dynamic quantization problems | 4 | 654 | October 16, 2022 |
| Transformers.onnx vs optimum.onnxruntime | 1 | 427 | September 12, 2022 |
| How to optimize ONNX seq2seq model? | 2 | 1153 | August 25, 2022 |
| Exporting Optimum Pipeline for Triton | 1 | 519 | August 20, 2022 |
| Regarding Quantizing gpt2-xl, gpt2-large, &c | 2 | 605 | August 10, 2022 |
| Load pytorch trained model via optimum | 5 | 1547 | August 10, 2022 |
| Support for Mpnet models | 2 | 565 | August 8, 2022 |
| What does the decoder with past values mean | 1 | 708 | August 5, 2022 |
| Recommended Approach for Distributed Inference | 3 | 1178 | August 1, 2022 |
| Optimum & RoBERTa: how far can we trust a quantized model against its pytorch version? | 10 | 1312 | July 27, 2022 |
| Symlink error when importing ORTSeqClass model via Pipeline | 4 | 1087 | July 22, 2022 |
| Can not import classes ORTModelFor(...) in AWS Sagemaker | 4 | 755 | July 14, 2022 |
| Quantized Model size difference when using Optimum vs. Onnxruntime | 3 | 745 | July 14, 2022 |
| Use_auth_token and revision with the class ORTModelForSequenceClassification? | 1 | 684 | July 5, 2022 |
| Unexpected input data type | 1 | 1718 | June 29, 2022 |
| InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Unexpected input data type. Actual: (tensor(int32)) , expected: (tensor(int64) | 1 | 2230 | June 29, 2022 |
| Onnx Vs Optimum | 1 | 778 | June 28, 2022 |
| Pass CPU cores to speed up inference | 1 | 1328 | June 14, 2022 |
| Quantization on customized model | 1 | 794 | May 10, 2022 |
| Optimum v1.1.0 breaking problems | 1 | 835 | April 26, 2022 |