| Topic | Replies | Views | Activity |
|---|---|---|---|
| Looking for help converting transformers to ONNX with HF Optimum | 0 | 279 | November 9, 2023 |
| Reasoning Distillation with Huggingface Trainer | 0 | 244 | November 8, 2023 |
| An extra space appears before the entities recognised with RoBERTa fine-tuned for Token Classification | 0 | 158 | November 8, 2023 |
| Adapter-transformers vs transformers | 1 | 123 | November 8, 2023 |
| How can I implement a custom model to use the Seq2SeqTrainer class | 0 | 444 | November 8, 2023 |
| The size of tensor a (146) must match the size of tensor b (1214) at non-singleton dimension 1 | 0 | 380 | November 8, 2023 |
| Rag model set up | 0 | 697 | November 7, 2023 |
| Llama 2 10x slower than LLaMA 1 | 1 | 729 | November 7, 2023 |
| How can one visualize the Cross-Attention of a VisionEncoderDecoderModel? | 2 | 2003 | November 7, 2023 |
| Should I use .map(processor) or define tokenizer=processor? | 0 | 173 | November 7, 2023 |
| Convert OpenAI whisper transformer model to Quantized tflite model | 1 | 2402 | November 7, 2023 |
| The num_return_sequences parameter in model.generate does not return unique outputs | 0 | 392 | November 6, 2023 |
| Error with get_peft_model() and PromptTuningConfig | 1 | 1551 | November 6, 2023 |
| I want to use BERT model weights to train a GPT model; how is that possible? | 0 | 151 | November 4, 2023 |
| Context window decreased after finetuning? | 0 | 190 | November 4, 2023 |
| Fine tuning llama2 with multiple GPUs and Hugging Face trainer | 1 | 3495 | November 3, 2023 |
| Chatbot in offline mode when using langchain.HuggingFaceImbeddings | 0 | 4844 | November 3, 2023 |
| SegformerImageProcesser only supports uint8 masks | 0 | 135 | November 2, 2023 |
| Shockingly Incorrect Evaluate Function in Huggingface API | 1 | 168 | November 2, 2023 |
| 0% accuracy when finetuning from certain models. [CLS] token embeddings not learned | 1 | 614 | November 2, 2023 |
| Regarding the data input injected into transformer_xl or transformer models | 0 | 88 | November 2, 2023 |
| Streaming Inference without TGI | 0 | 352 | November 2, 2023 |
| Is it possible to evaluate generations/output while fine-tuning an LLM? | 2 | 753 | November 1, 2023 |
| How to restrict training to one GPU if multiple are available, co | 4 | 14470 | November 1, 2023 |
| Train LoRA adapters on Multiple Datasets in Parallel for llama7B | 0 | 986 | November 1, 2023 |
| Abnormally large value of MobileBert's `<cls>` embedding | 0 | 123 | November 1, 2023 |
| Ä token inserted by ByteLevelBPETokenizer | 0 | 565 | November 1, 2023 |
| Optimum-neuron example script fails on Trainium instance | 0 | 266 | November 1, 2023 |
| Beam Search without `model.generate` | 0 | 205 | November 1, 2023 |
| Embedding layer or last hidden_state | 0 | 215 | November 1, 2023 |