[URGENT] Issues with Training RoBERTa Model for Text Prediction with Fill Mask Task
|
|
4
|
65
|
March 18, 2024
|
Seq2SeqTrainer produces error during validation when using T5
|
|
0
|
23
|
March 18, 2024
|
Unexpected input type after export
|
|
0
|
20
|
March 18, 2024
|
Mistral trouble when fine-tuning : Don't set pad_token_id = eos_token_id
|
|
0
|
24
|
March 18, 2024
|
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!
|
|
22
|
59452
|
March 18, 2024
|
Kosmos-2 Fine tuning
|
|
21
|
307
|
March 18, 2024
|
[Announcement] Generation: Get probabilities for generated output
|
|
55
|
21900
|
March 18, 2024
|
Correct way to save/load adapters and checkpoints in PEFT
|
|
0
|
31
|
March 18, 2024
|
ValueError: Expected input batch_size to match target batch_size in Token Classification
|
|
8
|
72
|
March 17, 2024
|
Can't find Keras.engine
|
|
1
|
39
|
March 17, 2024
|
Deepspeed zero-2 cpu offloading killing process = -9 error
|
|
1
|
597
|
March 17, 2024
|
Can't push model to model hub
|
|
1
|
53
|
March 17, 2024
|
Device while using pipeline
|
|
0
|
30
|
March 16, 2024
|
No instructions in documentation to train a new IDEFICS model from scratch
|
|
0
|
35
|
March 16, 2024
|
Difference between AutoModelForCausalLM and peft_model.merge_and_unload() for a LoRA model during inference
|
|
1
|
309
|
March 16, 2024
|
Fine-tuning for translation with facebook mbart-large-50
|
|
1
|
1423
|
March 16, 2024
|
Tokenizer train_new_from_iterator hanging for several models
|
|
0
|
29
|
March 16, 2024
|
Trainer freezes/crashes after evaluation step
|
|
1
|
51
|
March 15, 2024
|
I am following a Hugging Face guide for fine-tuning Whisper but I run into an error when training
|
|
0
|
39
|
March 15, 2024
|
Is it ok to have max_length greater than context_length of the model
|
|
0
|
37
|
March 15, 2024
|
Release timeline for 4.39.0 / mamba?
|
|
0
|
40
|
March 14, 2024
|
Error while using LILT model "index out of range in self"
|
|
5
|
454
|
March 14, 2024
|
Getting warning message on creation of WeightedLossTrainer object
|
|
0
|
113
|
March 14, 2024
|
Getting error in importing TFTrainer
|
|
0
|
67
|
March 14, 2024
|
Quantizing a model on M1 Mac for qlora
|
|
0
|
65
|
March 14, 2024
|
`seq_classif_dropout = 0.2` what is the use of adding dropout after the classification network
|
|
0
|
39
|
March 14, 2024
|
Further finetuning a LoRA finetuned CausalLM Model
|
|
7
|
2742
|
March 14, 2024
|
Conceptual question: Early loading of the model defeats the purpose of deepspeed!
|
|
0
|
43
|
March 14, 2024
|
How to fine-tune a Mistral-7B model for machine translation?
|
|
1
|
112
|
March 13, 2024
|