| Topic | Replies | Views | Activity |
|---|---|---|---|
| ValueError: The model is quantized with QuantizationMethod.QUANTO and is not serializable | 0 | 11 | April 28, 2024 |
| How to avoid `trust_remote_code=True` for my models | 3 | 30 | April 28, 2024 |
| Negative Kl values during PPO training (TRL library) | 0 | 13 | April 28, 2024 |
| DPOTrainer and sequence length | 0 | 15 | April 27, 2024 |
| ValueError: attention_mask is missing in the dataloader | 0 | 23 | April 27, 2024 |
| Training Reformer model from scratch with deepspeed - backprop error | 0 | 24 | April 26, 2024 |
| What should I do if I want to use local dataset xsum in this project | 0 | 29 | April 26, 2024 |
| How to output loss from model.generate()? | 15 | 4054 | April 26, 2024 |
| Train T5 from scratch | 4 | 3034 | April 26, 2024 |
| (Audio-to-audio models) Should I use 2 models sequentially or create 1 model for attempting to make a music to music model? | 0 | 19 | April 26, 2024 |
| Does generate's max_length influence training? | 0 | 30 | April 25, 2024 |
| Finetuned State-space/mamba model not working on huggingface model | 0 | 24 | April 25, 2024 |
| DPOTrainer consumes lots of VRAM | 0 | 28 | April 25, 2024 |
| How to evaluate before first training step? | 8 | 3633 | April 25, 2024 |
| Error to import transformers[torch] or accelerate -U | 0 | 30 | April 25, 2024 |
| Getting error - trainer.train() | 3 | 370 | April 25, 2024 |
| Prohibitively large RAM consumption on Trainer validation | 2 | 80 | April 24, 2024 |
| ValueError: Mixed precision training with AMP or APEX (`--fp16`) and FP16 evaluation can only be used on CUDA devices | 9 | 19954 | April 24, 2024 |
| Multiple time fine-tuning VideoMAE model adding n class each time | 0 | 34 | April 24, 2024 |
| How to not show the progress bar for evaluation only? | 1 | 67 | April 24, 2024 |
| How to disable Huggingface Hub during Trainer saving of PEFT models? | 2 | 88 | April 24, 2024 |
| Multivariate time-series transformer | 0 | 38 | April 24, 2024 |
| TypeError: map() got an unexpected keyword argument 'num_proc' | 0 | 45 | April 24, 2024 |
| Using Trainer class + 4/8 bit quantised model for prediction | 0 | 43 | April 24, 2024 |
| Why the model loading of llama2 is so slow? | 6 | 6021 | April 24, 2024 |
| Out of bounds Error in label conversion , two labels getting converted to 0 and 247 | 0 | 34 | April 24, 2024 |
| PerceiverIO Output Query Array Doubts | 0 | 47 | April 23, 2024 |
| SSL Certificate Issue | 5 | 10184 | April 23, 2024 |
| How to cluster words into semantic entities, when performing information extraction? | 2 | 802 | April 23, 2024 |
| RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! | 24 | 67516 | April 23, 2024 |