Parallelize Mistral/Llama 2 output | 0 | 55 | April 15, 2024
Can I use the "AutoModelForSequenceClassification" class for generative models? | 2 | 457 | April 15, 2024
Looking for exploratory study / best practices for LoRA adapters config (LLM fine-tuning) | 0 | 67 | April 15, 2024
Access feature in custom compute_loss method | 0 | 48 | April 15, 2024
TrOCR model not using the GPU even though I specified it | 0 | 69 | April 15, 2024
Import transformers fails; installation issue? | 1 | 724 | April 15, 2024
Fine-Tuning a Language Model with Data Extracted from Multiple PDFs for a Chat Interface | 1 | 440 | April 15, 2024
Is the Hugging Face course enough for understanding transformers and LLMs? | 2 | 122 | March 10, 2024
Logits from generate and model call different | 0 | 75 | April 14, 2024
When I use my fine-tuned causal LM for inference on a prompt, I get nothing but the last word repeated multiple times | 1 | 86 | April 14, 2024
Solving a tensor size mismatch error | 0 | 77 | April 14, 2024
Padding options for LayoutLM processor | 0 | 103 | April 14, 2024
Help with Sparse LLM Implementation | 0 | 63 | April 14, 2024
Model for image regression | 0 | 67 | April 13, 2024
Inverse normalising entities in Whisper | 2 | 637 | April 13, 2024
Difference in model prediction before saving and after loading | 0 | 57 | April 13, 2024
How to set up Trainer for a regression? | 6 | 10060 | April 13, 2024
Fine-tuning BERT with multiple classification heads | 10 | 2272 | January 19, 2024
Remove a named module from a pre-trained model | 0 | 66 | April 12, 2024
Mistral model generates the same embeddings for different input texts | 2 | 65 | April 12, 2024
Loss becomes nan | 0 | 85 | April 12, 2024
Caching encoder state for multiple encoder-decoder `.generate()` calls? | 2 | 75 | April 12, 2024
Trainer API is not working; it complains about NumPy deprecation issues | 0 | 53 | April 11, 2024
Metrics for Training Set in Trainer | 9 | 15896 | April 11, 2024
RuntimeError: CUDA error: device-side assert triggered 4x10 | 0 | 68 | April 11, 2024
Trainer RuntimeError: The size of tensor a (462) must match the size of tensor b (448) at non-singleton dimension 1 | 16 | 30914 | April 11, 2024
How to properly UPCAST the model weights to float32? | 2 | 102 | April 11, 2024
Kosmos-2 fine-tuning | 35 | 764 | April 11, 2024
Shouldn't RobertaForCausalLM generate something? | 8 | 1083 | April 11, 2024
How many GB of RAM do I need to train DBRX? | 2 | 105 | April 11, 2024