Topic | Replies | Views | Activity
Loading a safetensors format model using Hugging Face Transformers | 2 | 4781 | September 13, 2023
Training Loss Sudden Spike After 8 Hours of pre-training a BERT Model | 0 | 1136 | September 13, 2023
Unlikely unchanged losses for multiple epochs | 0 | 188 | September 13, 2023
Continued (in-domain) Pre-training of BART | 1 | 467 | September 13, 2023
Can't understand the graphs logged by `wandb` | 0 | 271 | September 12, 2023
Hugging Face hosting cost calculation | 2 | 887 | September 12, 2023
Unusual pattern of CUDA out of memory error when using hyperparameter search (optuna backend) | 0 | 285 | September 12, 2023
Default distributed strategy used in single-node multi-GPU env | 0 | 120 | September 12, 2023
How to instantiate Bart Decoder in a non causal way - PyTorch | 0 | 156 | September 11, 2023
ModuleNotFoundError when activating venv in CGI script | 0 | 301 | September 11, 2023
LoraConfig task_type | 0 | 625 | September 11, 2023
MobileBERT, training from scratch: not seeing where it loads the teacher | 3 | 414 | September 11, 2023
Trying to understand the paper on Chinese LLaMA | 0 | 531 | September 11, 2023
Eval with trainer not running with PEFT LoRA model | 1 | 1631 | September 10, 2023
SegFormer fine-tuned on personal dataset problem on loss computation | 0 | 193 | September 9, 2023
Model.generate generates way too long outputs | 0 | 311 | September 9, 2023
BPE tokenizers and spaces before words | 4 | 26842 | September 8, 2023
IndexError: index out of range in self | 0 | 469 | September 8, 2023
How to load model with .pth and avoid ponderous pytorch_model.bin | 0 | 1938 | September 8, 2023
Language Model Skips entire Sentence | 0 | 219 | September 8, 2023
Google Colab disrupted; when loading again to continue training, it runs from the beginning. How can I fix it? | 0 | 196 | September 8, 2023
Failed to import transformers.trainer | 0 | 3426 | September 8, 2023
Accuracy drops using Gradient checkpointing | 0 | 156 | September 7, 2023
Host memory still occupied after Hugging Face model deleted | 1 | 211 | September 7, 2023
Learning rate with deepspeed is fixed despite lr set to auto | 2 | 2205 | September 6, 2023
Multi-label token classification | 34 | 7788 | September 6, 2023
Get multiple metrics when using the Hugging Face Trainer | 5 | 7627 | September 6, 2023
Add custom constraint in generate() | 0 | 581 | September 6, 2023
Loading Weights from Customized Model | 0 | 641 | September 6, 2023
How to get timestamps for each word in a transcription | 2 | 2100 | September 5, 2023