SFTTrainer Class
|
|
0
|
127
|
October 11, 2023
|
Onnx export functionality failure for facebook/opt-2.7b with optimum CLI
|
|
0
|
337
|
October 11, 2023
|
Pix2struct based model ddp code conversion
|
|
1
|
312
|
October 11, 2023
|
Text Input Sequence Error
|
|
2
|
1142
|
October 11, 2023
|
I want to implement the ToT (Tree of Thoughts) framework using an open-source language model
|
|
0
|
365
|
October 11, 2023
|
I want to perform conversational/dialogue summarization on customer-agent data (call center). Which model should I fine-tune, or is any pretrained model available?
|
|
1
|
554
|
October 11, 2023
|
Using Hugging Face’s models on multiple computers
|
|
0
|
320
|
October 10, 2023
|
Flan-T5 with Tensorflow-Serving
|
|
0
|
417
|
October 9, 2023
|
How to minimize memory consumption when loading from pretrained models?
|
|
0
|
347
|
October 9, 2023
|
How to load after calling trainer.model.push_to_hub() on a fine tuned model?
|
|
1
|
908
|
October 9, 2023
|
When using SGD: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
|
|
0
|
1909
|
October 9, 2023
|
Intermediate features from a Huggingface pretrained model
|
|
0
|
335
|
October 8, 2023
|
Tried to download Mistral 7B but got an error message
|
|
3
|
13469
|
October 8, 2023
|
torch.cuda.OutOfMemoryError when evaluating while training
|
|
0
|
515
|
October 8, 2023
|
Can I use sentence-transformers with tensorflow?
|
|
1
|
350
|
October 8, 2023
|
Trained a tokenizer from scratch but problem when loading
|
|
0
|
484
|
October 8, 2023
|
Quantized model with LoRA takes much more GPU memory than the un-quantized model with LoRA (for the E-5-Large Embedding Transformer)
|
|
4
|
1781
|
October 8, 2023
|
TrainingArgument
|
|
3
|
8252
|
October 8, 2023
|
Customising pretrained SegFormer
|
|
4
|
1580
|
October 6, 2023
|
LiLT not returning words when ocr_=True
|
|
0
|
118
|
October 6, 2023
|
Test data size error in TimeSeriesTransformer
|
|
0
|
237
|
October 5, 2023
|
How to sample from the validation set when using Trainer?
|
|
4
|
1920
|
October 5, 2023
|
Jupyter notebook hangs when creating TrainingArguments
|
|
0
|
314
|
October 5, 2023
|
Finetuned llama7b model is 5x slower than Hugging Face raw model
|
|
2
|
1527
|
October 5, 2023
|
Info about insertion of "distillation_token" into the audio spectrogram transformer class
|
|
0
|
182
|
October 4, 2023
|
Compatibility of transformers version 4.11.1 with Python 3.11
|
|
0
|
2154
|
October 4, 2023
|
Speed up beam search for item generation
|
|
1
|
955
|
October 4, 2023
|
Sequence numerical classification
|
|
1
|
932
|
October 3, 2023
|
Evaluating on MMLU while finetuning using Trainer
|
|
0
|
805
|
October 3, 2023
|
Example script for VideoMAEForPreTraining
|
|
0
|
146
|
October 3, 2023
|