| Topic | Replies | Views | Activity |
|---|---|---|---|
| Optimize AND quantize with Optimum | 11 | 2824 | February 10, 2024 |
| Improving Quantization Accuracy for ONNX Models with Optimum | 0 | 469 | February 8, 2024 |
| Can I convert llama 2 "Chat" model into onnx using llama/convert_to_onnx.py script? | 3 | 1256 | January 30, 2024 |
| Audio classifier in TFLite format | 0 | 459 | January 25, 2024 |
| Need advice for implementing Greedy Search for ORTModelForSeq2SeqLM | 2 | 472 | January 17, 2024 |
| Optimum roberta base quantization model recall drop 10% | 5 | 439 | January 15, 2024 |
| packaging.version.InvalidVersion: Invalid version: ' ' | 1 | 1000 | January 10, 2024 |
| How to convert Speech Encoder Decoder to onnx | 1 | 567 | January 10, 2024 |
| Some nodes were not assigned to the preferred execution providers | 1 | 1746 | January 10, 2024 |
| Can bloom-7b1 be fine tuned using gaudi 1? | 12 | 883 | January 9, 2024 |
| Optimum warnings while quantizing | 0 | 504 | January 6, 2024 |
| FlashAttention-2's 16 bit requirement | 2 | 1607 | December 26, 2023 |
| How to configure ONNX models from Hugging Face to use model options in C++? | 0 | 438 | November 10, 2023 |
| Is a wheel to be released with the 1.14.0 release? | 1 | 362 | November 7, 2023 |
| Donut fine tuning question | 0 | 1334 | October 16, 2023 |
| Optimum export ONNX failure | 0 | 617 | September 30, 2023 |
| Improving Whisper for Inference | 11 | 3171 | September 20, 2023 |
| Order between optimization and quantization | 1 | 475 | September 19, 2023 |
| Optimum-Cli [ Task Manager Error ] | 1 | 597 | September 18, 2023 |
| Custom data preparation for LayoutLM model | 1 | 930 | September 18, 2023 |
| Static quantization of gpt2-style models with ORTQuantizer | 3 | 774 | September 18, 2023 |
| How to Prune Transformer based Model? | 2 | 3675 | August 25, 2023 |
| How to ensure that while running with llama2-70B, we use parallelism? | 11 | 1500 | August 22, 2023 |
| How to load checkpoint shards with gaudi instead of cpu? | 1 | 907 | August 21, 2023 |
| Error while Trying to run inference using gaudi on a finetuned llama2 model using habana repo | 9 | 615 | August 21, 2023 |
| ORT CLI vs. Programmatic | 12 | 1168 | August 17, 2023 |
| 4 Bit quantization | 4 | 509 | August 11, 2023 |
| Static quantization of activations for transformers | 2 | 1286 | August 11, 2023 |
| Export a BetterTransformer to ONNX | 3 | 2466 | August 11, 2023 |
| Exporting model wav2vec2 not supported? | 3 | 983 | August 10, 2023 |