| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| BartForConditionalGeneration: loss function diverges instead of converging | 0 | 118 | May 12, 2024 |
| Beam search error | 2 | 553 | May 12, 2024 |
| An error occurred: You have to specify input_ids | 0 | 272 | May 11, 2024 |
| How to change max_length of a fine tuned model | 4 | 11249 | May 11, 2024 |
| Phi3 Mini 4k Instruct Flash Attention not found | 4 | 4990 | May 11, 2024 |
| LayoutLMv3 inference - bboxes are incorrect | 0 | 111 | May 10, 2024 |
| How could I define a LogitsProcessorList with multi parameters? | 0 | 88 | May 10, 2024 |
| Argmax of Generation Probabilities doesn't match with Generated Sequence Tokens | 2 | 944 | May 10, 2024 |
| How could I fusion the logits from different models and then convert it to Token? | 0 | 97 | May 10, 2024 |
| Finetune_rag.py won't save checkpoints | 0 | 114 | May 9, 2024 |
| CLIP: The `backend_tokenizer` provided does not match the expected format | 3 | 231 | May 9, 2024 |
| What does the `use_cache` in `generate` actually do? | 1 | 2269 | May 9, 2024 |
| AWD-LSTM beats finetuned BERT as train ds decreases?! :person_shrugging:t4: | 2 | 126 | May 9, 2024 |
| How to count how many forward passes were done in model.generate when using assistant_model | 0 | 84 | May 9, 2024 |
| How to pass multiple datasets into Trainer for Knowledge distillation in NMT | 3 | 334 | May 9, 2024 |
| Trainer doesn't show the loss at each step | 20 | 35161 | May 9, 2024 |
| Lazy model initialization | 3 | 927 | May 8, 2024 |
| Getting zero gradients for image patch embeddings when implementing GRADCAM for ViLT | 0 | 91 | May 8, 2024 |
| Input to reshape is a tensor with 3763200 values, but the requested shape requires a multiple of 20384 | 0 | 86 | May 8, 2024 |
| Having multiple candidate labels in a zero shot classification model | 3 | 573 | May 8, 2024 |
| Why eval_accumulation_steps takes so much memory | 5 | 1430 | May 8, 2024 |
| Add metrics to object detection example | 12 | 3838 | May 8, 2024 |
| Runtime error: NotImplementedError: Cannot copy out of meta tensor; no data! | 0 | 1957 | May 7, 2024 |
| Llama-2 significantly slower than other models on huggingface | 2 | 974 | May 7, 2024 |
| Retraining the SAM model on the color image database in order to segment multiple classes in the image… | 0 | 346 | May 7, 2024 |
| Cuda Out of Memory when fine tuning llm model | 3 | 1149 | May 7, 2024 |
| Lower Memory Usage for TF GPT-J | 1 | 809 | May 7, 2024 |
| How to stream responses from AutoModelforCausalLM? | 0 | 435 | May 7, 2024 |
| Fine tuning T5 Encoder and T5 Decoder separately | 1 | 722 | May 6, 2024 |
| AttributeError: module 'torch' has no attribute 'chalf' | 8 | 996 | May 6, 2024 |