| Topic | Replies | Views | Activity |
|---|---|---|---|
| Exporting Optimum Pipeline for Triton | 1 | 771 | August 20, 2022 |
| Regarding Quantizing gpt2-xl, gpt2-large, &c | 2 | 1189 | August 10, 2022 |
| Load pytorch trained model via optimum | 5 | 2483 | August 10, 2022 |
| Support for Mpnet models | 2 | 777 | August 8, 2022 |
| What does the decoder with past values means | 1 | 1539 | August 5, 2022 |
| Recommended Approach for Distributed Inference | 3 | 1778 | August 1, 2022 |
| Optimum & RoBERTa: how far can we trust a quantized model against its pytorch version? | 10 | 2181 | July 27, 2022 |
| Symlink error when importing ORTSeqClass model via Pipeline | 4 | 1565 | July 22, 2022 |
| Can not import classes ORTModelFor(...) in AWS Sagemaker | 4 | 1084 | July 14, 2022 |
| Quantized Model size difference when using Optimum vs. Onnxruntime | 3 | 1324 | July 14, 2022 |
| Use_auth_token and revision with the class ORTModelForSequenceClassification? | 1 | 911 | July 5, 2022 |
| Unexpected input data type | 1 | 2595 | June 29, 2022 |
| Onnx Vs Optimum | 1 | 1098 | June 28, 2022 |
| Pass CPU cores to speed up inference | 1 | 2548 | June 14, 2022 |
| Quantization on customized model | 1 | 1169 | May 10, 2022 |
| Optimum v1.1.0 breaking problems | 1 | 1069 | April 26, 2022 |