Fitting huge models on multiple nodes | 0 | 153 | September 6, 2024
Getting sslcertverificationerror exception | 0 | 113 | September 6, 2024
How to fine-tune "openai-gpt" model for sequence classification? | 3 | 1336 | September 5, 2024
When using greedy decoding on a causal LM, how does `generate` handle tie-breaking between logits? | 0 | 13 | September 5, 2024
Why does `generate` in `LlamaForCausalLM` give me _slightly_ lower logits than __call__? | 1 | 148 | September 5, 2024
"What's the Difference Between max_length and max_new_tokens?" | 0 | 585 | September 5, 2024
How to continue training with HuggingFace Trainer? | 4 | 8495 | September 5, 2024
Confused about max_length and max_new_tokens | 7 | 35840 | September 5, 2024
How to continue training a model from where it left off? | 0 | 177 | September 5, 2024
Flash attention has no effect on inference | 7 | 14916 | September 4, 2024
CPU faster than MacBook GPU for Summarization | 0 | 61 | September 4, 2024
Max_length parameter in T5 | 5 | 1209 | September 4, 2024
`target_sizes` and `output.logits` do not align in `image_processor.post_process_object_detection` | 0 | 48 | September 3, 2024
Generate dataset for fine tuning on PDF(s) | 6 | 3097 | September 3, 2024
Successive Matryoshka training - Healthcare concepts | 0 | 5 | September 3, 2024
Positional Embeddings in Transformer Implementations | 1 | 1764 | September 3, 2024
Evaluation stuck at 0% when trying to finetune OD model | 0 | 22 | September 3, 2024
use_temp_dir=False in push_to_hub() triggers a file not found error | 0 | 38 | September 2, 2024
How to disable caching in .from_pretrained() | 3 | 771 | September 2, 2024
Choosing a hosting or endpoint option to run BART-CNN | 0 | 16 | September 2, 2024
Is it possible to add `system prompt` to Blenderbot? | 1 | 330 | September 2, 2024
Data collation: cannot understand the logics of the API | 0 | 25 | September 2, 2024
Difference Between Attention Mask and Causal Mask | 1 | 6143 | September 2, 2024
Chat Templates for BlenderBot | 5 | 1192 | September 2, 2024
How to make transformer (T5) for translation return n translation inferences? | 2 | 36 | September 2, 2024
Finetuning GPT model multiple times | 1 | 105 | September 2, 2024
Multi-label Classification | 0 | 16 | September 2, 2024
T5 for a multi-classification task with returning probabilities [0,1] | 0 | 15 | September 1, 2024
How to modify loss function in a seq2seq trainer? | 1 | 218 | August 31, 2024
How to properly instantiate an untrained model (model with randomly generated weights) | 0 | 67 | August 30, 2024