Topic | Replies | Views | Activity
Logits from generate and model call different | 0 | 50 | April 14, 2024
When I try to use my fine-tuned Causal LM model to inference a prompt, I get nothing but the last word repeated multiple times | 1 | 57 | April 14, 2024
Solving error for mismatch tensor size | 0 | 45 | April 14, 2024
Padding options for LayoutLM processor | 0 | 37 | April 14, 2024
Help with Sparse LLM Implementation | 0 | 48 | April 14, 2024
Model for image regression | 0 | 47 | April 13, 2024
Inverse normalising entities in Whisper | 2 | 619 | April 13, 2024
Difference in model prediction before saving and after loading | 0 | 44 | April 13, 2024
How to set up Trainer for a regression? | 6 | 9769 | April 13, 2024
Fine-tuning BERT with multiple classification heads | 10 | 2108 | January 19, 2024
Remove a named module from a pre-trained model | 0 | 51 | April 12, 2024
Mistral model generates the same embeddings for different input texts | 2 | 58 | April 12, 2024
Loss becomes nan | 0 | 58 | April 12, 2024
Caching encoder state for multiple encoder-decoder `.generate()` calls? | 2 | 59 | April 12, 2024
Trainer API is not working. It's complaining of NumPy deprecation issues | 0 | 42 | April 11, 2024
Metrics for Training Set in Trainer | 9 | 15137 | April 11, 2024
RuntimeError: CUDA error: device-side assert triggered 4x10 | 0 | 52 | April 11, 2024
Trainer RuntimeError: The size of tensor a (462) must match the size of tensor b (448) at non-singleton dimension 1 | 16 | 29736 | April 11, 2024
How to properly UPCAST the model weights to float32? | 2 | 81 | April 11, 2024
Kosmos-2 Fine tuning | 35 | 699 | April 11, 2024
Shouldn't RobertaForCausalLM generate something? | 8 | 1051 | April 11, 2024
How many GB of RAM do I need to train DBRX? | 2 | 81 | April 11, 2024
Tensor size error when generating embeddings for documents using pre-trained models | 3 | 92 | April 11, 2024
Search models by tokenizer | 0 | 39 | April 10, 2024
Fine-Tune LoRA adapter starting from existing adapter | 1 | 153 | April 10, 2024
NotImplementedError: Cannot copy out of meta tensor; no data! | 2 | 5020 | April 10, 2024
Seeking Clarification: Model Evaluation - Train and Val loss | 3 | 80 | April 10, 2024
Development status of huggingface/tflite-android-transformers and modern alternatives | 0 | 64 | April 10, 2024
Is LLaMA rotary embedding implementation correct? | 5 | 2849 | April 10, 2024
Exporting UDOP to ONNX fails | 0 | 79 | April 8, 2024