Topic | Replies | Views | Activity
Which is actually used to configure scheduler in deepspeed and TrainingArguments? | 0 | 93 | May 17, 2024
How to control the GPU id for loading model weights when fintune Llama8B model with the Trainer? | 0 | 76 | May 17, 2024
Cannot import name 'WhisperForAudioClassification | 0 | 160 | May 16, 2024
Isn't KV cache influenced by position encoding in inference? | 3 | 927 | May 16, 2024
ModuleNotFoundError: No module named 'transformers.agents' | 2 | 749 | May 16, 2024
Why follow Flan-T5 template when T5 tokenizer ignores multiple newlines | 0 | 114 | May 15, 2024
Decoder only model - how to have it not include the prompt in its output? | 3 | 664 | May 15, 2024
ValueError: Unrecognized configuration class <class 'transformers.models.whisper.configuration_whisper.WhisperConfig'> | 0 | 245 | May 15, 2024
KeyError: 'eval_qwk' when used get_peft_model | 0 | 112 | May 14, 2024
Fedrated Learning using trainer | 0 | 72 | May 14, 2024
Next sentence prediction on custom model | 3 | 3399 | May 14, 2024
Llama 3 tokenizer prints cryptic message | 0 | 158 | May 13, 2024
Whisper Inference RuntimeError: The expanded size of the tensor (3000) must match the existing size (3392) at non-singleton dimension 1. Target sizes: [80, 3000]. Tensor sizes: [80, 3392] | 1 | 749 | May 13, 2024
HUBERT Implementation with increased vocabulary size | 0 | 87 | May 13, 2024
Model Parralelism approach in Llama Code looks like very inefficient | 0 | 95 | May 13, 2024
ValueError when training on a multi GPU setup and DPO | 0 | 245 | May 13, 2024
Transformer shifting output question | 1 | 355 | May 13, 2024
Not able to add data_collator to Trainer | 1 | 636 | May 13, 2024
How can we automatically run the script with a token included in a script | 0 | 83 | May 13, 2024
Index Error: Target {} is out of bounds | 0 | 267 | May 13, 2024
Load_in_8bit vs. loading 8-bit quantized model | 6 | 6996 | May 13, 2024
Convert Conv1D to nn.Linear | 2 | 981 | May 12, 2024
SFTrainer doesn't show added column | 0 | 101 | May 12, 2024
How can I keep use of the base model version for inference after fine-tuning | 1 | 95 | May 12, 2024
BartForConditionalGeneration: loss function diverges instead of converging | 0 | 123 | May 12, 2024
Beam search error | 2 | 572 | May 12, 2024
An error occurred: You have to specify input_ids | 0 | 308 | May 11, 2024
How to change max_length of a fine tuned model | 4 | 11549 | May 11, 2024
Phi3 Mini 4k Instruct Flash Attention not found | 4 | 5153 | May 11, 2024
LayoutLMv3 inference - bboxes are incorrect | 0 | 120 | May 10, 2024